Abstract

This report describes research about flow graphs: labeled, directed, acyclic
graphs which abstract representations used in a variety of Artificial
Intelligence applications. Flow graphs may be derived from flow grammars much
as strings may be derived from string grammars; this derivation process
forms a useful model for the stepwise refinement processes used in
programming and other engineering domains.

The central result of this report is a parsing algorithm for flow graphs.
Given a flow grammar and a flow graph, the algorithm determines whether
the grammar generates the graph and, if so, finds all possible derivations
for it. The author has implemented the algorithm in LISP.

The intent of this report is to make flow-graph parsing available as an
analytic tool for researchers in Artificial Intelligence. The report explores
the intuitions behind the parsing algorithm, contains numerous, extensive
examples of its behavior, and provides some guidance for those who wish to
customize the algorithm to their own uses.
Chapter 1.
Introduction

This report summarizes research about flow graphs, a graph-based
representation abstracted from those used in a variety of Artificial
Intelligence applications. A flow graph is a labeled, directed, acyclic graph
whose nodes are annotated with ports: positions at which edges enter or
leave the node. Here is an example of a flow graph:




We can generate complex flow graphs from simple ones by replacing single
nodes with multi-node subgraphs. The obvious analogy between this process
and that of string derivation from a context-free grammar gives rise to the
notion of a flow grammar: a set of rewriting rules which specify how to
replace given nodes with pre-specified subgraphs. Here is an example of a
rule from a flow grammar:







The central result of this report is a parsing algorithm for flow graphs.
Given a flow grammar and a flow graph, the algorithm determines whether
the grammar generates the graph and, if so, finds all possible derivations for
it. The algorithm runs in time polynomial in the number of nodes in the
input graph, with an exponent and constant of proportionality determined
by the input grammar. The author has implemented the algorithm in LISP.

1.1. Motivation

The work described here grew out of the author's research into automated
program analysis [Brotsky 1981], done as part of the Programmer's
Apprentice project at the Artificial Intelligence Laboratory of the
Massachusetts Institute of Technology [Rich and Waters 1981]. In the work
of that group, programs are represented as annotated graphs, called plans,
whose nodes stand for operations and whose arcs indicate control and data
flow between the nodes. (Plans are additionally annotated with a great deal
of other information about the program they represent, but the details of
these annotations do not concern us here. Interested readers should consult
[Rich 1980].)

The author's idea was that the stepwise-refinement process, wherein
high-level program operations are implemented as groups of lower-level
operations, could naturally be modeled as a plan-rewriting process. Thus, flow
graphs were developed as abstractions of plan structure, flow grammars were
developed to encode allowable derivation steps, flow-graph derivations were
developed as models of plan derivations, and structural program analysis
could be effected through parsing.

This program-analysis work is continuing, but does not concern us here.
Flow graphs, while developed as ad hoc abstractions of plans, are general
enough to serve as abstractions of the graphical representations of other
domains. The intent of this report is to make flow graph parsing available
as an analytic tool for AI researchers in these other domains.



1.2. Background 

The structure of flow graphs and flow grammars has been influenced by early
work on graph grammars [Pfaltz and Rosenfeld 1969; Montanari 1970; Pavlidis
1972], but none of this work was concerned with parsing. The structure of
our parsing algorithm arose from careful study of Earley's algorithm [Earley
1969] and Donald E. Knuth's seminal work on LR(k) string grammars [Knuth
1965].

1.3. Structure of this Report 

Chapter 1 of this report is this introduction. Chapter 2 describes flow
graphs, flow grammars, and flow-graph derivations in detail. Chapter 3
presents a derivation of Earley's algorithm which differs considerably from
those found in standard sources. This derivation is given as background
for the very similar derivation of the graph parsing algorithm presented
in chapter 4. Finally, chapter 5 discusses flow graphs, grammars, and the
parsing algorithm. This discussion includes a brief complexity analysis of
the algorithm, and suggestions for related research.
Chapter 2.
Definitions

In this chapter we define flow graphs and flow grammars, and explain the
mechanism by which a grammar derives a graph.

2.1. Flow Graphs

A flow graph is a labeled, acyclic, directed graph whose nodes and edges are
restricted in a variety of ways:

• The label of each node is called its type.

• Each node has a set of input ports and a set of output ports. These two
sets are disjoint. All nodes with the same type have the same input and
output port sets.

• The input and output port sets of flow graph nodes are never empty.
That is, all nodes have at least one input and one output port.

• Edges in flow graphs do not run merely from one node to another, but
from a particular output port of one node to a particular input port of
another. No two edges may enter or exit from the same port; thus a node
can be adjoined by only as many edges as it has ports.

Intuitively, a flow graph looks like this:
Notice that ports (which are identified by numeric annotations on the nodes)
need not have edges adjoining them. Any input (or output) port in a flow
graph that does not have an edge running into (or out of) it is called an
input (or output) of that graph.
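
The restrictions on flow graphs can be made concrete with a short sketch. The Python fragment below is an illustration only, not the author's LISP implementation; the names NodeType, FlowGraph, add_edge, graph_inputs and graph_outputs are assumptions introduced here, and the sketch does not check acyclicity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NodeType:
    """A node type fixes the port sets of every node of that type."""
    name: str
    inputs: tuple    # input port labels, never empty
    outputs: tuple   # output port labels, never empty

    def __post_init__(self):
        if not self.inputs or not self.outputs:
            raise ValueError("port sets must be non-empty")
        if set(self.inputs) & set(self.outputs):
            raise ValueError("input and output port sets must be disjoint")


@dataclass
class FlowGraph:
    nodes: dict = field(default_factory=dict)  # node id -> NodeType
    edges: set = field(default_factory=set)    # (src, out_port, dst, in_port)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_edge(self, src, out_port, dst, in_port):
        if out_port not in self.nodes[src].outputs:
            raise ValueError("edges must leave from an output port")
        if in_port not in self.nodes[dst].inputs:
            raise ValueError("edges must enter at an input port")
        for (s, o, d, i) in self.edges:
            if (s, o) == (src, out_port) or (d, i) == (dst, in_port):
                raise ValueError("a port can adjoin at most one edge")
        self.edges.add((src, out_port, dst, in_port))

    def graph_inputs(self):
        """Input ports with no edge running into them."""
        used = {(d, i) for (_, _, d, i) in self.edges}
        return sorted((n, p) for n, t in self.nodes.items()
                      for p in t.inputs if (n, p) not in used)

    def graph_outputs(self):
        """Output ports with no edge running out of them."""
        used = {(s, o) for (s, o, _, _) in self.edges}
        return sorted((n, p) for n, t in self.nodes.items()
                      for p in t.outputs if (n, p) not in used)
```

With two hypothetical types, a (one input, two outputs) and b (two inputs, one output), connecting one output of an a-node to one input of a b-node leaves one unconnected input and one unconnected output on each node; these are the inputs and outputs of the graph.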



Notation. 

We will always direct our flow-graph diagrams from left to right. We will
often subscript node types so as to make them into unique labels. (This
avoids awkward circumlocutions such as "the third a from the bottom-left.")
When we do not care which port an edge adjoins, or if this is made clear
from context, we will omit port annotations. If we omit all the port
annotations on a node, we will often omit the circle drawn around the node's
label. Finally, we will always emphasize the inputs and outputs of graphs
by adjoining them with edge stubs, called the leading and trailing edges of
the graph.

Here is the graph we saw above written using the conventions just
described:







We will use this form whenever possible.

Terminology 

The linkage information for a node in a graph is a set of (port, edge) pairs
detailing which edge adjoins each port on that node. For example, figure 2.1
shows a graph whose edges have been labeled for easy reference. The linkage
information for two of the nodes in this graph is:



[Table: the (port, edge) pairs for each of the two nodes; the individual
entries are not legible in the source.]



In keeping with our left-to-right conventions, that portion of a node's linkage
information which involves only input (resp. output) edges is called its
left-linkage (resp. right-linkage) information.



Figure 2.1. A flow graph. The edges of the graph have been labeled for easy
reference.

2.2. Flow Grammars

Flow grammars are a generalization of context-free string grammars.
Essentially, a flow grammar is a set of rewriting rules, where each rule
explains how to replace a node in a graph with a particular sub-graph. Just
as a string grammar gradually rewrites a single-element string as a longer
and longer string, a flow grammar gradually rewrites a single-node graph as
a larger and larger graph.

More precisely, a flow grammar G consists of 4 parts: a set P of
productions, two disjoint sets of types, N (the non-terminals) and T (the
terminals), and a distinguished non-terminal type S, the start type of G.
Each production in P consists of three parts: two flow graphs and a list of
port correspondences. The first of the two flow graphs, the production's
left-hand side, consists of a single node whose type must be from N. The
second of the flow graphs, the right-hand side, consists of nodes whose
types are from N ∪ T. The left- and right-hand sides must have the same
number of inputs and outputs, and the list of port correspondences is a 1-1
correspondence between inputs and outputs of the two sides.

A flow grammar is shown in figure 2.2. Each rule maps a single node to
a graph. The left-hand side node of each rule must be a non-terminal, that
is, of a non-terminal type, while the right-hand side graph can mix types at
will. (We will indicate non-terminal types with capital letters, and terminal
types with lower case letters.)

Figure 2.2. A flow grammar.

The inputs of the left-hand side of a rule correspond one-to-one with the
inputs of the right-hand side, as do the outputs. Where clarity is needed,
we will indicate this relationship by drawing lines between the edge stubs
adjoining corresponding ports, as is done in figure 2.2. Where it is clear,
however, we will indicate the correspondence simply by matching the
alignment of left-hand side edge stubs with those of the right-hand side.
For example, the second rule in the above grammar could have been written
as follows:




Notice that there is no flow-grammar equivalent of an "ε-rule" in a string
grammar: that is, there are no flow grammar rules whose right-hand sides
are empty. This is because it is meaningless to replace a node in a graph
with nothing: the edges that were adjoined to that node must go somewhere.

2.3. Flow Grammar Derivations

Flow graphs are derived from flow grammars in the expected way. We
start with a graph consisting of a single S-node and then rewrite it with an
applicable rule from the grammar. This gives us a flow graph. If there are
no non-terminals in the derived graph, the derivation stops. Otherwise, we
pick a non-terminal and a rule that derives it, and replace the non-terminal
by the right-hand side of the rule. This gives us another graph, and the
whole process iterates.

Of course, when we replace a non-terminal by a right-hand side that
derives it, we have to do something with the edges that adjoined that
non-terminal. This is what the port correspondences in rules are for: if p was
a port on the replaced non-terminal, then the edge that adjoined p (if any)
is made to adjoin p's corresponding port in the replacement graph. The
restrictions on rule formation ensure that there is never any question as to
how a right-hand side should replace a left-hand side. For example, figure 2.3
shows the derivation of a graph from the grammar given in the last section.
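
A single derivation step can be sketched in a few lines. The Python fragment below is an illustration under assumptions of our own, not the author's implementation: a graph is a dictionary from node ids to type names plus a set of (source, output port, destination, input port) edges, a rule is a dictionary with the hypothetical keys lhs, rhs_nodes, rhs_edges, inputs and outputs, and the last two record the port correspondences from left-hand-side ports to (right-hand-side node, port) pairs.

```python
import itertools


def apply_rule(nodes, edges, target, rule, fresh):
    """Replace node `target` by the rule's right-hand side.

    nodes: dict mapping node id -> type name
    edges: set of (src, out_port, dst, in_port)
    rule:  dict with 'lhs', 'rhs_nodes', 'rhs_edges', and the port
           correspondences 'inputs' and 'outputs'
           (lhs port -> (rhs node, rhs port))
    fresh: iterator of unused integers, used to rename the copied nodes
    """
    if nodes[target] != rule["lhs"]:
        raise ValueError("rule does not derive this node's type")
    # Copy the right-hand side under fresh node ids.
    rename = {n: f"{n}#{next(fresh)}" for n in rule["rhs_nodes"]}
    new_nodes = {n: t for n, t in nodes.items() if n != target}
    new_nodes.update({rename[n]: t for n, t in rule["rhs_nodes"].items()})
    new_edges = {(rename[s], o, rename[d], i)
                 for (s, o, d, i) in rule["rhs_edges"]}
    # Each edge that adjoined a port p of the replaced node is made to
    # adjoin p's corresponding port in the replacement graph.
    for (s, o, d, i) in edges:
        if d == target:
            n, p = rule["inputs"][i]
            d, i = rename[n], p
        if s == target:
            n, p = rule["outputs"][o]
            s, o = rename[n], p
        new_edges.add((s, o, d, i))
    return new_nodes, new_edges


# A hypothetical rule S -> (a followed by b), applied inside x -> S -> y.
rule = {
    "lhs": "S",
    "rhs_nodes": {"a": "a", "b": "b"},
    "rhs_edges": {("a", 2, "b", 1)},
    "inputs": {1: ("a", 1)},
    "outputs": {2: ("b", 2)},
}
nodes = {"x": "x", "S0": "S", "y": "y"}
edges = {("x", 2, "S0", 1), ("S0", 2, "y", 1)}
new_nodes, new_edges = apply_rule(nodes, edges, "S0", rule, itertools.count(1))
```

Because the port correspondence is 1-1, every edge that adjoined the replaced node has exactly one place to go, which is why the step needs no search.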
Figure 2.3. Sample Flow Graph Derivation.

Chapter 3.
Motivation for the Algorithm



Earley's algorithm is a well-known string parsing algorithm [Earley 1969].
It takes a string grammar and a string as input, and determines all possible
derivations of that string from that grammar. The output of the algorithm
is a list of representations known as items; the acceptability and derivations
of the input string are encoded in this list.

This section presents a derivation of Earley's algorithm that differs
significantly from those found in standard sources. For a given input grammar
and string, we first construct a non-deterministic stack-based parser for the
grammar. We then deterministically simulate the behavior of that parser
when run on the input string; the representations of the parser's
configurations generated in this simulation will be homomorphic to the items
produced by Earley's algorithm when run on the same input.

The derivation given here is presented as background for the very similar
derivation of our flow graph parsing algorithm given in the next chapter.
Much of the complexity inherent in both algorithms arises from
optimizations that are employed in the simulation process; since the intuitions
underlying those optimizations are the same in both the string and graph cases,
we believe that presenting them in the relatively familiar context of string
parsing will make their use in graph parsing more comprehensible.

3.1. Non-Deterministic String Parsers

Given a context-free grammar G with productions P_1, ..., P_n and start
symbol S, the following construction yields a non-deterministic stack-based
parser for G:

1. Construct a state-machine recognizer for the right-hand side of each P_i.
A state in the recognizer R_i constructed for rule P_i will consist of a copy
of P_i's right-hand side with a dot placed just to the left or right of one of
its symbols; the state set of R_i will consist of all the states formed in this
way. The state transition function of R_i will map (state, symbol) pairs
to states: each state with a dot to the left of some symbol a will have a
transition on a to the state whose dot is just to the right of a. The initial
state of R_i will be the state with a dot to the left of the leftmost symbol
in P_i's right-hand side; its final (accepting) state will be the state whose
dot is to the right of the rightmost symbol in P_i's right-hand side.
For example, if P_i is the production A → xBAy, then the recognizer for
P_i will have the following five states:

[A → ·xBAy]
[A → x·BAy]
[A → xB·Ay]
[A → xBA·y]
[A → xBAy·]

and the transition diagram for P_i's recognizer would look as follows:




[A → ·xBAy] --x--> [A → x·BAy] --B--> [A → xB·Ay]
            --A--> [A → xBA·y] --y--> [A → xBAy·]



2. Create a state-based machine P whose state space and transition function
are the union of all those of the recognizers for the P_i. The initial and
final states of P are the initial and final states of the recognizer for S.

3. Convert P to a non-deterministic stack machine by adding a stack and
instructions as follows: For each state s which has a transition on a
non-terminal input, associate instructions with that state which (i) push the
state onto the stack and (ii) put P into the start state of the recognizer for
some production which derives that non-terminal. (If the non-terminal
on which a state has a transition has n possible derivations, then this
step will associate n instructions with that state.)

4. Complete P by adding instructions as follows: To each accepting state of
a recognizer for a P_i, add an instruction which (i) pops a state off the top
of the stack and (ii) puts P into the state which is led to by the popped
state's transition on the non-terminal derived by P_i.
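
The first and third steps of this construction are mechanical enough to sketch directly. The Python fragment below is an illustration only; the representation of a dotted state as a (left-hand side, right-hand side, dot position) triple and the function names recognizer, show and push_targets are assumptions made here, not part of the original construction.

```python
def recognizer(lhs, rhs):
    """Build the recognizer for one production lhs -> rhs.

    A state is the triple (lhs, rhs, dot), where dot counts the symbols
    of the right-hand side lying to the left of the dot.
    Returns (states, transitions, initial, final).
    """
    states = [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]
    transitions = {(states[k], rhs[k]): states[k + 1]
                   for k in range(len(rhs))}
    return states, transitions, states[0], states[-1]


def show(state):
    """Render a state in dotted-rule form, using '.' for the dot."""
    lhs, rhs, dot = state
    return f"[{lhs} -> {rhs[:dot]}.{rhs[dot:]}]"


def push_targets(grammar, state):
    """Initial states a push instruction may enter from `state` (step 3).

    grammar is a list of (lhs, rhs) pairs; the result is empty when the
    symbol after the dot is a terminal or the dot is at the far right.
    """
    lhs, rhs, dot = state
    if dot == len(rhs):
        return []
    return [(l, r, 0) for (l, r) in grammar if l == rhs[dot]]


states, transitions, initial, final = recognizer("A", "xBAy")
for s in states:
    print(show(s))
```

For the production A → xBAy this prints the five dotted states listed in step 1, from the initial state with the dot at the far left to the accepting state with the dot at the far right.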



The parser constructed in this way operates by repeatedly choosing one of
the instructions or transitions available in its current state and executing
it. (The choice involved here is what makes the parser non-deterministic.)
We first consider an example of such a parser, and then discuss some
implications of the construction technique.

3.1.1. An Example

Consider the following grammar G:

S → Aa
A → c
A → cA

G derives all strings consisting of one or more c's followed by an a. We will
carry out the construction described above so as to produce a parser for G
and then run this parser on the input cca.

First, we construct state machines which recognize each of the
productions in G. These are as follows:






[S → ·Aa] --A--> [S → A·a] --a--> [S → Aa·]

[A → ·c] --c--> [A → c·]

[A → ·cA] --c--> [A → c·A] --A--> [A → cA·]




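Now we create the union machine and replace non-terminal transitions with
pushes:

[Figure: the union machine, in which the A-transitions out of [S → ·Aa] and
[A → c·A] are each replaced by two push instructions, one entering
[A → ·c] and one entering [A → ·cA].]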




Finally, we complete the construction by adding stack pops on reductions:

[Figure: the union machine with a pop instruction added to each accepting
state: [S → Aa·], [A → c·], and [A → cA·].]
This completes our parser. We will represent a given configuration of the
parser as

(input position) [(state), ((stack top); ...; (stack bottom))]

where the states are represented using the dot representation shown above.
(Recall that stack entries are just recognizer states.) When running the
parser on the string cca, it starts in the following configuration:

0(ε) [S → ·Aa, ()]

The state [S → ·Aa] has two transitions on push instructions. The parser
must choose one of the two, leading it into one of these two configurations:

0(ε) [A → ·c, (S → ·Aa)]
0(ε) [A → ·cA, (S → ·Aa)]

At this point, no more state transitions are possible without reading an
input symbol. Thus, the parser will read the first c, leading it into one of
two configurations:

1(c) [A → c·, (S → ·Aa)]
1(c) [A → c·A, (S → ·Aa)]

The first of these two configurations is an accepting state for the rule
A → c, and allows a pop into the following configuration:

1(c) [S → A·a, ()]

while the second configuration is in a state containing push transitions to
the following configurations:

1(c) [A → ·c, (A → c·A; S → ·Aa)]
1(c) [A → ·cA, (A → c·A; S → ·Aa)]

Once again, no more state transitions are possible without reading another
input symbol.

We can summarize all the possible computations so far in the following
tabular fashion:

0(ε)  [S → ·Aa, ()]
      [A → ·c, (S → ·Aa)]
      [A → ·cA, (S → ·Aa)]
1(c)  [A → c·, (S → ·Aa)]
      [S → A·a, ()]
      [A → c·A, (S → ·Aa)]
      [A → ·c, (A → c·A; S → ·Aa)]
      [A → ·cA, (A → c·A; S → ·Aa)]

We will use this form extensively to summarize actions of these parsers; for
example, the remainder of the run of this parser on the string cca goes as
follows:

2(cc)   [A → c·, (A → c·A; S → ·Aa)]
        [A → cA·, (S → ·Aa)]
        [S → A·a, ()]
        [A → c·A, (A → c·A; S → ·Aa)]
        [A → ·c, (A → c·A; A → c·A; S → ·Aa)]
        [A → ·cA, (A → c·A; A → c·A; S → ·Aa)]
3(cca)  [S → Aa·, ()]

3.1.2. Discussion

From one point of view, this construction technique produces classic recursive
descent parsers, such as those presented in undergraduate compiler classes.
Where a recursive-descent parser would have a subroutine dedicated to the
recognition of each rule's right-hand side, these parsers have state-machine
recognizers, and these recognizers are linked together via a
"subroutine-call" mechanism based on a stack. In what follows, we will often
describe the actions of these parsers using terminology suggested by this
metaphor.

From another point of view, this construction technique produces classic
push-down automata. The state-based machines constructed for each
grammar rule are finite-state recognizers for the right-hand sides of those
rules, and the dots in their states indicate the expected position of a read
head in the parser's input. In this context, the stack push and pop
instructions act as ε-transitions between the various recognizers, and the
parser appears as a non-deterministic push-down automaton whose finite state
control compares substrings of the input against the right-hand sides of
grammar rules and whose stack monitors the center-embeddedness of the input
as a whole. In what follows, we will also use terminology suggested by this
metaphor.




3.2. Simulating the State-based Parser 

In order to simulate a parser constructed as above, we must explore all
the actions which follow from all possible non-deterministic choices. The
recursive-descent metaphor suggests that we do this with a sequential
approach that employs backtracking, while the automaton metaphor suggests
a parallel approach in which one simulator state represents a number of
reachable parser states. We shall adopt the latter approach, and keep track
of all the (state, stack) pairs reachable by a parser at each step of the
input. The result of the simulation will be a sequence of lists of reachable
configurations, much like those used in the sample parse above.
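
The parallel approach can be sketched as follows. This Python fragment is an illustration under stated assumptions rather than the author's implementation: states are (left-hand side, right-hand side, dot) triples, stacks are tuples of states with the top first, and the names closure, simulate and accepts are our own. The sketch assumes a grammar that is not left-recursive; on a left-recursive grammar the stacks grow without bound and closure does not terminate.

```python
GRAMMAR = [("S", "Aa"), ("A", "c"), ("A", "cA")]
START = "S"


def closure(configs, grammar):
    """Follow every push and pop reachable without reading input."""
    seen, work = set(configs), list(configs)
    while work:
        (lhs, rhs, dot), stack = work.pop()
        new = []
        if dot < len(rhs):
            # Push: call a recognizer for each rule deriving the next symbol.
            # (No rule derives a terminal, so terminals produce no pushes.)
            for l, r in grammar:
                if l == rhs[dot]:
                    new.append(((l, r, 0), ((lhs, rhs, dot),) + stack))
        elif stack:
            # Pop: return to the caller and step it over the non-terminal.
            (cl, cr, cd), rest = stack[0], stack[1:]
            new.append(((cl, cr, cd + 1), rest))
        for c in new:
            if c not in seen:
                seen.add(c)
                work.append(c)
    return seen


def simulate(text, grammar=GRAMMAR, start=START):
    """Return one set of (state, stack) configurations per input position."""
    current = closure({((l, r, 0), ()) for l, r in grammar if l == start},
                      grammar)
    lists = [current]
    for ch in text:
        scanned = {((l, r, d + 1), st) for (l, r, d), st in current
                   if d < len(r) and r[d] == ch}
        current = closure(scanned, grammar)
        lists.append(current)
    return lists


def accepts(text, grammar=GRAMMAR, start=START):
    """True if some configuration finished a start rule with an empty stack."""
    final = simulate(text, grammar, start)[-1]
    return any(l == start and d == len(r) and not st
               for (l, r, d), st in final)
```

Run on cca, the sketch produces configuration sets of sizes 3, 5, 6 and 1 for input positions 0 through 3, one configuration per line of the sample parse given earlier.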

3.2.1. Preliminaries

We use here a slightly different representation for the stack segment of a
configuration than we did in the sample parse above. In line with our
subroutine-call point of view on push operations, we will not keep the whole
stack with each configuration. Rather, each time we make a transition to
the initial state of a recognizer, we will keep a return pointer which indicates
the configuration we were in before entering that state.

For example, we presented above the configuration sequence for the parse
of cca. If we make the representational changes just described, we obtain
the following, more compact representation, in which we have subscripted
the configurations for use in return pointers:

0(ε)    [S → ·Aa, ]_1
        [A → ·c, 1]_2        ;expand A from item 1
        [A → ·cA, 1]_3
1(c)    [A → c·, 1]_4
        [S → A·a, ]_5        ;return to item 1
        [A → c·A, 1]_6
        [A → ·c, 6]_7
        [A → ·cA, 6]_8
2(cc)   [A → c·, 6]_9
        [A → cA·, 1]_10
        [S → A·a, ]_11
        [A → c·A, 6]_12
        [A → ·c, 12]_13
        [A → ·cA, 12]_14
3(cca)  [S → Aa·, ]_15       ;accept



We will call the [(state), (return pointer)] pairs used here items, to
distinguish them from configuration representations that show the complete
stack.¹



3.2.2. Multiple-Call Collapsing



It is convenient to think of our simulation method as simulating, not one,
but many, non-deterministic parsers at the same time. As these parsers run,
they make different decisions at each choice point, and the simulation keeps
track of all the different configurations they get into. At any position in
the input, the current state of any given parser is represented in some item
on the current item list, and the contents of that parser's stack can be
computed by following return pointers from that item upwards.

It may happen, however, that two parsers whose stacks differ enter the
same state at the same position in the input. For example, consider the
following grammar G':

S → S'
S → S''
S'' → S'
S' → Aa
A → c
A → cA

G' derives the same strings as the grammar G given above. However, if G
derives a string via derivation tree T, then G' derives it via the following
two trees:



Tree 1: S derives S', which derives the string as in T.

Tree 2: S derives S'', which derives S', which derives the string as in T.

¹ The relationship with Earley items is examined below.

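The parallel structure of these trees can be seen clearly in the following
simulation of a parse of cca under G':

0(ε)    [S → ·S', ]_1
        [S' → ·Aa, 1]_2
        [A → ·c, 2]_3
        [A → ·cA, 2]_4
        [S → ·S'', ]_5
        [S'' → ·S', 5]_6
        [S' → ·Aa, 6]_7      ;compare with item 2
        [A → ·c, 7]_8
        [A → ·cA, 7]_9
1(c)    [A → c·, 2]_10
        [S' → A·a, 1]_11
        [A → c·A, 2]_12
        [A → ·c, 12]_13
        [A → ·cA, 12]_14
        [A → c·, 7]_15       ;compare items 15-19 with items 10-14
        [S' → A·a, 6]_16
        [A → c·A, 7]_17
        [A → ·c, 17]_18
        [A → ·cA, 17]_19
2(cc)   [A → c·, 12]_20
        [A → cA·, 2]_21
        [S' → A·a, 1]_22
        [A → c·A, 12]_23
        [A → ·c, 23]_24
        [A → ·cA, 23]_25
        [A → c·, 17]_26
        [A → cA·, 7]_27
        [S' → A·a, 6]_28
        [A → c·A, 17]_29
        [A → ·c, 29]_30
        [A → ·cA, 29]_31
3(cca)  [S' → Aa·, 1]_32
        [S → S'·, ]_33
        [S' → Aa·, 6]_34
        [S'' → S'·, 5]_35
        [S → S''·, ]_36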

The possible configurations obtained upon reading each of the input tokens
break cleanly into two groups whose state transitions are identical but whose
stack environment is different. Each group can be thought of as simulating
the configurations of a different parser: one predicting the derivation that
starts S → S' and the other predicting the derivation that starts
S → S'' → S'. The similarity between the two groups is a corollary of the
fact that our grammar is context-free. In both cases we are seeing the
transitions involved in the leftmost derivation of cca from S'; these
transitions must remain the same regardless of the context of S' in the
derivation.

The key observation here is that, when the recognizer for a given rule
is called, the starting position of that recognizer in the input completely
determines its behavior. A particular recognizer may be called from parsers
with a variety of stack configurations, but if all the calls occur at the same
input position, we need only simulate the state transitions made by that
recognizer once; the results can then be used in all the parsers that made
the simultaneous calls.

With our representation, this optimization is easily made by turning
multiple calls to the same recognizer at the same input position into a single
call with multiple returns. This leads to the following parse list for cca under
G' (each item now contains a set of return pointers instead of just a single
one):

0(ε)    [S → ·S', {}]_1
        [S → ·S'', {}]_2
        [S'' → ·S', {2}]_3
        [S' → ·Aa, {1,3}]_4      ;compare with items 2 and 7 above
        [A → ·c, {4}]_5
        [A → ·cA, {4}]_6
1(c)    [A → c·, {4}]_7
        [S' → A·a, {1,3}]_8
        [A → c·A, {4}]_9
        [A → ·c, {9}]_10
        [A → ·cA, {9}]_11
2(cc)   [A → c·, {9}]_12
        [A → cA·, {4}]_13
        [S' → A·a, {1,3}]_14
        [A → c·A, {9}]_15
        [A → ·c, {15}]_16
        [A → ·cA, {15}]_17
3(cca)  [S' → Aa·, {1,3}]_18
        [S → S'·, {}]_19         ;return to item 1, accept
        [S'' → S'·, {2}]_20      ;return to item 3, ...
        [S → S''·, {}]_21        ;... accept



A useful way to conceptualize the optimization performed here is to visualize
the parse trees "built" by the pushes and pops of the various parsers being
simulated. Before the optimization, the simulator built both of the correct
derivation trees:

      S                 S
      |                 |
      S'                S''
     / \                |
    A   a               S'
   / \                 / \
  c   A               A   a
      |              / \
      c             c   A
                        |
                        c

After the optimization, it builds the following hybrid structure:

      S
     / \
    |   S''
    |  /
    S'
   / \
  A   a
 / \
c   A
    |
    c

The latter structure contains the same information about the parse as the
two previous trees together.



3.2.3. Left-recursion

An interesting special case in which multiple-call collapsing is applicable is
that of left-recursive grammar rules. These rules present well-known
difficulties for deterministic recursive-descent parsers, because the parser
can not know how many times to invoke the recursive expansion of a
non-terminal without looking ahead in the input. For example, consider the
following grammar:

S → Aa
A → c
A → Ac

This grammar derives exactly the same strings as the right-recursive
grammar G given above, but consider the following "parse" of the input string
cca (we have not used multiple-call collapsing):

0(ε)    [S → ·Aa, ]_1
        [A → ·c, 1]_2            ;expand A from item 1
        [A → ·Ac, 1]_3           ;ditto
        [A → ·c, 3]_4            ;expand A from item 3 (uh oh)
        [A → ·Ac, 3]_5
        [A → ·c, 5]_6            ;and so on
        [A → ·Ac, 5]_7
        [A → ·c, 7]_8            ;and so on ...
        ...
1(c)    [A → c·, 1]_∞+1
        [S → A·a, ]_∞+2          ;return to item 1
        [A → c·, 3]_∞+3
        [A → A·c, 1]_∞+4         ;return to item 3
        [A → c·, 5]_∞+5
        [A → A·c, 3]_∞+6         ;return to item 5 (uh oh)
        [A → c·, 7]_∞+7
        [A → A·c, 5]_∞+8         ;and so on ...
        ...
2(cc)   [A → Ac·, 1]_∞+∞+1
        [S → A·a, ]_∞+∞+2
        [A → Ac·, 3]_∞+∞+3
        [A → A·c, 1]_∞+∞+4
        [A → Ac·, 5]_∞+∞+5
        [A → A·c, 3]_∞+∞+6
        ...



If we perform multiple-call collapsing, however, something very interesting
happens:

0(ε)    [S → ·Aa, {}]_1
        [A → ·c, {1,3}]_2        ;expand A from items 1 and 3
        [A → ·Ac, {1,3}]_3       ;note the self-recursion here
1(c)    [A → c·, {1,3}]_4
        [S → A·a, {}]_5          ;return to item 1 ...
        [A → A·c, {1,3}]_6       ;and return to item 3
2(cc)   [A → Ac·, {1,3}]_7
        [S → A·a, {}]_8
        [A → A·c, {1,3}]_9
3(cca)  [S → Aa·, {}]_10

The subtlety here involves item 3, which serves the same purpose as items 3,
5, 7, 9, ..., in the previous simulation. We are in fact simulating an infinite
number of parsers here, one predicting each of the following parse trees:





[Figure: the infinite family of parse trees for A, in which the rule A → Ac
is applied zero, one, two, ... times before the rule A → c is applied.]



At any given point past the first c in the input, however, they have all
invoked the same recognizer (for A → Ac) at the same point, so the
simulation keeps just one representation for all of them.

3.2.4. Duplicate-Item Merging

Multiple-call collapsing optimizes the case where different parsers invoke
the same recognizer at the same point in the input. If we consider only
unambiguous grammars, this is the only case in which recognizers invoked by
different parsers are guaranteed to perform identical actions. But consider
the following ambiguous grammar fragment:

A → BC
B → b
B → bb
C → bc
C → c

This fragment produces the following two derivations of the string fragment
bbc:

      A                A
     / \              / \
    B   C            B   C
    |  / \          / \  |
    b b   c        b   b c



These derivations are recovered in the following parse:

0(ε)    [A → ·BC, {}]_1
        [B → ·b, {1}]_2
        [B → ·bb, {1}]_3
1(b)    [B → b·, {1}]_4
        [A → B·C, {}]_5
        [C → ·bc, {5}]_6
        [C → ·c, {5}]_7
        [B → b·b, {1}]_8
2(bb)   [C → b·c, {5}]_9
        [B → bb·, {1}]_10
        [A → B·C, {}]_11         ;compare with item 5
        [C → ·bc, {11}]_12
        [C → ·c, {11}]_13
3(bbc)  [C → bc·, {5}]_14
        [A → BC·, {}]_15
        [C → c·, {11}]_16
        [A → BC·, {}]_17         ;compare with item 15

Note that items 15 and 17 are both the same. The situation encountered here
is quite similar to that in which we invoke multiple-call collapsing, in that
recognizers for the same rule invoked simultaneously by different parsers
have reached the same state at the same point in the input. In this case that
state is not the recognizers' initial state, but the same reasoning shows that
both recognizers will perform identical actions until their respective parsers
make differing derivation decisions. Thus, as with multiple-call collapsing,
we need only keep one item to represent the state of both recognizers.

3.2.5. The String Algorithm

We are now ready to state our string parsing algorithm. The algorithm
takes as input a grammar G and a string a, and determines whether G
generates a. The output of the algorithm is a sequence of item lists, one
for each symbol in a, which represent all the configurations reachable by a
non-deterministic string parser for G operating on a. The algorithm does
not construct a parse tree for the input, but we show below how it can easily
be modified to construct all possible ones.

The algorithm operates by using a list of items I_j to keep track of all
the configurations a parser might be in after reading the j-th input symbol.
Given lists I_0, ..., I_{j-1}, the algorithm constructs list I_j by using three
operations: a scanner operation, a predictor operation, and a completer
operation.³ We first describe the nature of these operations, and then how
the algorithm uses them to construct the lists I_0, I_1, ..., I_n.

The Scanner

The scanner operation takes as input an item i from list I_{j-1} and the j-th
input symbol a_j. Let s be the state part of i and r its set of return pointers.
If s has no transition on a_j, then the scanner does nothing. Otherwise, s
has a transition on a_j to some state s', and the scanner creates an item i'
on list I_j whose state part is s' and whose set of return pointers is r.

We can abbreviate the scanner operation as follows: Let [A → α·aβ, r]
be an item from I_{j-1}. If a is the j-th symbol of the input string, then the
scanner adds the item [A → αa·β, r] to I_j.
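
The scanner is small enough to state directly in code. The following Python sketch is an illustration only; the encoding of an item as a (left-hand side, right-hand side, dot, return set) tuple and the function name scanner are assumptions introduced here.

```python
def scanner(item, symbol):
    """Return the item to add to I_j, or None if the state part of
    `item` has no transition on `symbol`.

    An item is (lhs, rhs, dot, returns); `returns` is a frozenset of
    return pointers and is carried over unchanged.
    """
    lhs, rhs, dot, returns = item
    if dot < len(rhs) and rhs[dot] == symbol:
        return (lhs, rhs, dot + 1, returns)
    return None


# [A -> .cA, {1}] scanned over c gives [A -> c.A, {1}].
advanced = scanner(("A", "cA", 0, frozenset({1})), "c")
# [S -> .Aa, {}] has no transition on c, so nothing is added.
nothing = scanner(("S", "Aa", 0, frozenset()), "c")
```

Note that the scanner never touches return pointers: reading an input symbol moves a recognizer forward but does not change who called it.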

³ The names of these operations are taken from [Earley 1969].

The Predictor

The predictor operation takes as input an item i from list I_j. If the state
part s of i does not have a transition on a non-terminal, then the
predictor operation does nothing. Otherwise, let A be the non-terminal on
which s has a transition, and let s_1, ..., s_m be the initial states of all the
recognizers for rules which derive A. For each s_k, the predictor operation
checks to see if there is an item with state part s_k on list I_j. If so, the
predictor adds i to the return set of that item. If not, the predictor creates
an item with state part s_k and return set {i} and adds it to I_j.

We can abbreviate the predictor operation as follows: Let i = [A -> α · Bγ, r]
be an item on I_j. For all rules B -> β in G, the predictor operation searches
I_j for an item of the form [B -> · β, r']. If it finds one, it adds i to r'.
Otherwise, it adds an item [B -> · β, {i}] to I_j.

The Completer

The completer operation takes as input an item i on list I_j. If the state
part of i is not the accepting state of a recognizer for some rule of G, the
completer operation does nothing. Otherwise, let A be the non-terminal
derived by the accepted rule, and let i_1, ..., i_m be the members of the
return set of i. The state part of each i_k must have a transition on A; let
s_1, ..., s_m be the states led to by those transitions. For each i_k, the
completer looks for an item on I_j whose state part is s_k and whose return
set is that of i_k. If it finds one, it does nothing; otherwise it adds such an
item to I_j.

The completer operation may be abbreviated as follows: Let [A -> γ ·, r_1]
be an item in I_j. For each item i' = [B -> α · Aβ, r_2] such that i' ∈ r_1, add
[B -> α A · β, r_2] to I_j if it is not already there.



The Algorithm

First, we construct I_0 as follows:

1. Let s_1, ..., s_m be the initial states of recognizers for the rules in G
which derive S. For each s_k, add an item to I_0 whose state is s_k and
whose return set is empty.

2. Complete I_0 by running the predictor on every item in it. If new items
are added to it, run the predictor on them, and repeat this until no new
items are added.

Next, we successively construct I_1, ..., I_n. Given I_0, ..., I_{j-1}, we
construct I_j as follows:

3. Run the scanner over every item in I_{j-1}.

4. Run the completer over every item in I_j. If this adds new items to I_j, run
the completer over them, and repeat this until no new items are added.

5. Run the predictor over every item in I_j. If this adds new items to I_j, run
the predictor over them, and repeat this until no new items are added.

A little thought will convince the reader that this is indeed the algorithm
used to produce the lists shown above. A string is accepted by this algorithm
if I_n contains an item whose return set is empty and whose state is the
accepting state of a recognizer for a rule deriving S.
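
The five steps above can be sketched in Python as follows. This is only an
illustrative sketch, not the report's LISP implementation. It assumes a
grammar without ε-rules, represents each recognizer state as a dotted rule,
and stands in for each return set with the index of the list on which the
item was first created (Earley's integer encoding), so acceptance is tested
with origin 0 rather than an empty return set. The names `recognize`,
`predict` and `complete` are hypothetical.

```python
def recognize(grammar, start, tokens):
    """Return True if `grammar` (non-terminal -> list of right-hand-side
    tuples) derives `tokens` from `start`. Assumes no empty right-hand sides
    and that terminal and non-terminal names are disjoint."""
    n = len(tokens)
    lists = [set() for _ in range(n + 1)]   # lists[j] plays the role of I_j

    def predict(j):
        # Steps 2 and 5: call a recognizer for each non-terminal after a dot.
        work = list(lists[j])
        while work:
            lhs, rhs, dot, origin = work.pop()
            if dot < len(rhs) and rhs[dot] in grammar:
                for alt in grammar[rhs[dot]]:
                    item = (rhs[dot], alt, 0, j)
                    if item not in lists[j]:
                        lists[j].add(item)
                        work.append(item)

    def complete(j):
        # Step 4: return from every recognizer that reached an accepting state.
        work = list(lists[j])
        while work:
            lhs, rhs, dot, origin = work.pop()
            if dot == len(rhs):
                for l2, r2, d2, o2 in list(lists[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        item = (l2, r2, d2 + 1, o2)
                        if item not in lists[j]:
                            lists[j].add(item)
                            work.append(item)

    for alt in grammar[start]:              # step 1
        lists[0].add((start, alt, 0, 0))
    predict(0)                              # step 2
    for j in range(1, n + 1):
        for lhs, rhs, dot, origin in lists[j - 1]:   # step 3: the scanner
            if (dot < len(rhs) and rhs[dot] not in grammar
                    and rhs[dot] == tokens[j - 1]):
                lists[j].add((lhs, rhs, dot + 1, origin))
        complete(j)
        predict(j)
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in lists[n])
```

For instance, with the rules S -> S + a and S -> a written as
`{"S": [("S", "+", "a"), ("a",)]}`, the call
`recognize(g, "S", ["a", "+", "a"])` accepts, while the input `["a", "+"]`
is rejected.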

3.2.6. Why is this Earley's Algorithm

The algorithm described above does not appear, prima facie, to be Earley's
algorithm. The apparent difference is due to a couple of factors, both of
which we examine here.

Abbreviation of Return Pointers

Our algorithm uses items of the form [A -> α · β, r], where r is a set of
return pointers. Earley's items have the form [A -> α · β, i], where i is the
number of input symbols read when the A recognizer was first invoked. (Of
course, at that time, the recognizer was represented by an item with its dot
at the far left of the rule's right-hand side.)

These representations seem unrelated; however, some thought reveals
that we can encode our representation in Earley's form. An item of the
form [A -> · α, r], when added to list I_i, represents a call on one of A's
recognizers at point i in the input. Thus, the callers of such an item (the
members of r) must be all the items for recognizers which expect to see an
A at point i in the input. But these items are exactly all those on I_i which
have an A to the right of their dot. Thus, if an item of the form [A -> · α, r]
appears on I_i, r must consist of exactly those items on I_i that have an A to
the right of their dot, so we can encode r with the integer i.




Handling of ε-rules

Earley's algorithm handles grammars with productions of the form A -> ε.
This involves running the completer on I_0, and alternating the repeated
application of steps 4 and 5 (instead of applying one repeatedly and then
the other).

If these steps were added to our algorithm, and if the representation were
changed as mentioned above, our algorithm description would agree exactly
with the description of Earley's algorithm given in [Aho and Ullman 1972].

3.2.7. Using the Algorithm to produce Parse Trees

The algorithm we have presented here is actually an acceptor, not a parser.
That is, while its output indicates immediately whether or not the input
string is in the language of the input grammar, it does not provide a parse
tree.

Algorithms are available from a variety of sources (e.g., [Aho and Ullman
1972]) which produce a parse tree from the parse lists output by our
algorithm. In addition, consider the following definitions of the scanner and
completer operations:

The Completer

The completer operation takes as input an item i on list I_j. If the state
part of i is not the accepting state of a recognizer for some rule of G, the
completer operation does nothing. Otherwise, let A be the non-terminal
derived by the accepted rule, and let i_1, ..., i_m be the members of the
return set of i. The state part of each i_k must have a transition on A; let
s_1, ..., s_m be the states led to by those transitions. For each i_k, the
completer looks for an item on I_j whose state part is s_k and whose return
set is that of i_k. If it finds one, it adds to it a pointer to i and a pointer to
i_k; otherwise it adds such an item (including these pointers) to I_j.

The Scanner

The scanner operation takes as input an item i from list I_{j-1} and the j-th
input symbol a_j. Let s be the state part of i and r its set of return pointers.
If s has no transition on a_j, then the scanner does nothing. Otherwise, s
has a transition on a_j to some state s', and the scanner creates an item i'
on list I_j whose state part is s', whose list of return pointers is r, and which
contains the same completer-added pointers as i (if any).

If the algorithm uses these definitions, each item of the form [A -> α ·, r]
in the completed lists will be the root of a pointer structure giving all
the derivation trees for that instance of A in the input. In particular, if a
sentence is accepted by the algorithm, the items of the form [S -> α ·, {}]
on I_n will be the roots of all the derivation trees for that sentence.

4 Our algorithm need not handle ε-productions, since we can convert any
grammar containing them to an equivalent grammar (in which such
productions cannot occur).

Chapter 4

The Algorithm



In this chapter we present our flow graph parsing algorithm. The inputs to
the algorithm are a flow grammar and a flow graph; its output is a sequence
of lists similar to the item lists produced by Earley's algorithm.

As in the last chapter, we will produce the algorithm by developing a
non-deterministic parser and then simulating its behavior deterministically.
Both the parser and the simulation technique generalize those we used for
strings: the resulting algorithm is a generalization in that, when it is run on
a string graph, it performs a superset of the actions performed by our string
algorithm.



4.1. Non-Deterministic Graph Parsers

The method we used to construct a parser for a string grammar consisted
essentially of two steps:

1. Construct recognizers for the right-hand sides of each of the grammar's
productions.

2. Construct a stack-based machine out of these recognizers by replacing
their non-terminal recognition steps with 'subroutine calls' on other
recognizers.

We will apply this same method to flow grammars in order to construct
flow graph parsers. The nature of this construction is determined by our
generalizations of (i) the mechanism used to read the parser's input, (ii)
the recognizers used for the right-hand sides of grammar rules, and (iii) the
linkage mechanism used to interconnect recognizers. Each of these
generalizations preserves the intuitions that arise in the string case, but
rephrases those intuitions so as to make them applicable to graphs as well as
strings.

4.1.1. Reading a Flow Graph

Our string parser constructed a parse for its input while reading it once from
left to right, and so will our graph parser. 'Once' means that the parser will
look at each node in the input only one time. 'From left to right' means
that the parser will consider nodes in the partial order imposed by the input
graph: that is, a node in the input will be looked at by the parser only when
it has already looked at all that node's predecessors.

As mentioned in the last chapter, it is natural to think of our string
parser as an automaton using a read head to examine its input. This head
moves from left to right over the input, passing the symbols read on to the
state-transition functions of the parser's active recognizers.

Our graph parsers will examine their input graphs as if they, too, had
read heads. These heads should be thought of as "multi-track" heads which
can be positioned over more than one node at a time. They start at the
left edge of the input, read nodes one at a time from left to right, and pass
information about these nodes on to the state transition functions of the
parser's recognizers.

For example, consider the following graph:




A parser reading this graph would start off with its read head positioned to
the left of the graph's two minimal nodes, like this:







It would then select one of the two minimal nodes to be read next; we don't
care which. Let us say it chooses the upper one; this would leave its read
head in the following position:




Here the parser must again choose which node to read; let us say it again
chooses the upper one. The read head would move over this node to give
the following position:







At this point, there is only one node the parser can consider for reading.
Intuitively, the read head is not "just to the left" of the graph's maximal
node: there is still a preceding node which must be read first. Thus, the
next read head position is as follows:




Finally, after the last node is read, we have:




and the read head stops.

We have indicated the position of the read head at each stage by denoting
the unique set of edges (possibly leading or trailing edges) all of which follow
all the nodes already read and precede all the nodes yet to be read. We call
these edge sets head positions, and we precisely characterize the order in
which graph parsers examine the nodes of their input as follows:

1. Each parser is considered to have a read head. The initial head position
of the read head in the input consists of all the input's leading edges.

2. The parser can examine any node all of whose incoming edges are in the
current head position. (Such a node is said to be in the right fringe of
the head position.)

3. When a parser whose read head is in position p examines a node n, its
read head moves to a new position p' calculated by taking p and replacing
n's incoming edges by its outgoing ones. (We call p' the successor of
p. The node n, all of whose outgoing edges are now in p', is said to be in
the left fringe of p'.)

4. The parser examines nodes one at a time until it reaches a head position
with no nodes in its right fringe.

The reader can verify that the example given above meets the above
conditions. Some thought will also show that (i) a node is never read until all
its predecessors have been read, and (ii) this method, when applied to any
flow graph, eventually reads all the nodes in that graph.

It is worth noting that this method, while phrased so as to apply to all
flow graphs, describes exactly the motion of our string parser's read head
through its input "string graph." The string case simply makes no use of
the non-determinism inherent in step (2).
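
The four read-head rules can be sketched in Python. This is a hedged
illustration only: the graph encoding (a dictionary from node name to its
incoming and outgoing edge lists), the function name `read_order`, and the
`choose` parameter that resolves the non-determinism of step (2) are all
assumptions made for the example, not part of the report.

```python
def read_order(nodes, leading_edges, choose=min):
    """Move a read head over a flow graph.

    nodes:         dict mapping node name -> (incoming edges, outgoing edges)
    leading_edges: the edges of the initial head position (rule 1)
    choose:        picks one node from the right fringe (rule 2 is
                   non-deterministic; any choice function is allowed)
    Returns the nodes in the order read, and the final head position.
    """
    head = set(leading_edges)
    unread = dict(nodes)
    order = []
    while True:
        # Rule 2: the right fringe holds every unread node all of whose
        # incoming edges are in the current head position.
        fringe = [n for n, (ins, _) in unread.items() if set(ins) <= head]
        if not fringe:            # rule 4: stop when the right fringe is empty
            break
        n = choose(fringe)
        ins, outs = unread.pop(n)
        # Rule 3: replace n's incoming edges by its outgoing ones.
        head = (head - set(ins)) | set(outs)
        order.append(n)
    return order, head
```

On a graph with two minimal nodes a and b feeding a third node c, the head
may read a and b in either order, but c only becomes readable once both of
its incoming edges are in the head position.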

Each time a graph parser examines a node, it passes three pieces of
information to the state transition functions of its active recognizers: the type
of the node read, its left-linkage information (a set of port-edge pairs), and
its right-linkage information (another set). As with our read-head motion
rules, it is worth noting that this list describes in a general manner the exact
information read by the head of a string parser. In the string case, however,
the left-linkage and right-linkage information are both trivial: it is always
the case that the only edge in the old head position went into the node's
only input port, and the only edge in the new head position came out of the
node's only output port.

4.1.2. Flow Graph Recognizers

The right-hand sides of flow-grammar rules are flow graphs; thus, the
recognizers from which we build our parser will be flow graph recognizers. These
recognizers will receive type and linkage information about the input from
the parser, and compare this information with that found in their target
graph, the right-hand side they are recognizing. Their structure and
function will be generalizations of those of their string counterparts: that is, they
will be state machines which make transitions based on the input read.

Figure 4.1. A Grammar.

Figure 4.2. A graph generated by the grammar of figure 4.1.



4.1.3. States

A state in a flow-graph recognizer consists of pairs matching edges in the
recognizer's target graph with edges in the parser's current head position.
For example, consider the grammar of figure 4.1 and the graph generated
by that grammar shown in figure 4.2. At some point in the parse of this
graph, the recognizer for the right-hand side of the A-rule might reach the
following state:



We have indicated the head position as in the last section; the labels
on the input-graph and target-graph edges indicate the pairing which is the
state. The target-graph edge paired with a given input edge is called the
target image of that edge; the latter is the input image of the former.

It is convenient (albeit redundant) to think of a state as having two
parts: (i) a set of edges from the target graph, and (ii) a 1-1 correspondence
between that set and some of the edges in the parser's head position. In
this view, it becomes clearer that the states of our string recognizers had
the same composition: their edge set was the edge denoted by their Earley
dot, and their correspondence was always the trivial one sending that edge
into the single edge in the parser's current head position. The triviality of
this correspondence allowed us to ignore it and "pretend" that the states
of our string recognizers were completely determined by their dot position.
We do not have this luxury in the graph case: for example, examine the two
states shown in figure 4.3, and consider which of these states should begin
a transition sequence leading to an accepting state.

4.1.4. State Transition Functions

The state transition functions of our graph recognizers take as inputs a
recognizer state and the type and linkage information of an input node;
they produce a new recognizer state as output. Recognizers operate in the
expected manner: they apply their state transition functions to their current
state and the information returned by the parser's read head, and then make
a transition to the new state returned by the transition function (if any).

The state transition function of a graph recognizer is best thought of as an
algorithm that proceeds in two steps: it first determines whether a transition
exists from the given state on the given input; if so, it then determines
the state that the transition leads to. In other words, the algorithm first
determines the acceptability of the input, and then it determines the correct
next state for the recognizer.

Figure 4.3. Two states which differ only in their correspondence part.

Figure 4.4. Some acceptable (state, input) pairs.

Acceptability is determined by comparing the type and left-linkage
information of the input node read with that of the target graph node which
corresponds to it. More precisely, let s be the current state, let n be the
input node read, let L be the set of input edges of n, and let L' be the set
of target images of L under s. If L' consists of all the input edges of some
target graph node n', if the type of n' is the same as the type of n, and if
the port adjoined on n' by each edge in L' is the same as the port adjoined
by its input image (in L), then n is said to be acceptable, and n' is said to be
its target image. Figure 4.4 shows examples of acceptable input situations;
figure 4.5 shows some unacceptable ones.

Once the acceptability of an input node has been determined, the new
state to move to is computed by matching its right-linkage information
against that of its target image. More precisely, let s, n, n', L, and L'
be as above, let R be the output edges of n, and let R' be the output edges
of n'. The new state s' is computed by (i) deleting from s all pairs involving
edges in L, and (ii) adding a new pair for each edge in R. In step (ii), the
pair added for an edge e which leaves n from a port p pairs it with the target
graph edge e' in R' which leaves n' from p. (Since n and n' have the same
type and thus the same port sets, this operation is well-defined.) Figure 4.6
shows the (new-state, new-input) pairs computed from the pairs of figure 4.4
by this procedure. Notice that state pairs not involving input edges to the
input node read are unaffected.

Figure 4.5. Some unacceptable (state, input) pairs.

Figure 4.6. New (state, input) pairs computed by the state-transition algorithm
from the pairs of figure 4.4.
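
The two-step procedure (acceptability, then next state) can be sketched as
follows. The encoding is an assumption made for illustration: a state is a
dictionary from input edges to their target images, and a node is a triple of
its type and two dictionaries mapping port names to edges. The function name
`transition` is hypothetical.

```python
def transition(state, node, target_nodes):
    """Return the next state, or None if the input node is unacceptable.

    state:        dict mapping input edge -> target edge (the state's pairs)
    node:         (type, ins, outs) for the input node n; ins and outs map
                  port -> edge
    target_nodes: iterable of (type, ins, outs) triples for the target graph
    """
    n_type, n_ins, n_outs = node
    consumed = set(n_ins.values())              # L, the input edges of n
    if any(e not in state for e in consumed):
        return None                             # some edge of L has no target image
    images = {port: state[e] for port, e in n_ins.items()}   # L', keyed by port
    for t_type, t_ins, t_outs in target_nodes:
        # Acceptable: same type, and L' is exactly the input edges of n',
        # each arriving at the same port as its input image.
        if t_type == n_type and t_ins == images:
            # (i) delete the pairs involving edges in L ...
            new_state = {e: t for e, t in state.items() if e not in consumed}
            # (ii) ... and pair each output edge of n with the edge of n'
            # leaving from the same port.
            for port, e in n_outs.items():
                new_state[e] = t_outs[port]
            return new_state
    return None
```

Pairs that do not involve the input edges of the node read are copied into
the new state unchanged, matching the observation at the end of the paragraph
above.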

As the reader may have noticed, this procedure agrees with that used to
determine the state transition functions of our string recognizer. In fact, if
we take into account both the edge mapping implicit in our string recognizer
states, and the linkage information implicitly read by the string parser read
head, this is the procedure used to compute the state transition functions
in our string parser.

Figure 4.7. A grammar and a state in which a sub-recognizer should be invoked.

Figure 4.8. The initial state of the sub-recognizer invoked from the state of
figure 4.7.

4.1.5. Linkage Mechanism

Whenever a recognizer moves into a state whose edge set contains inputs to a
non-terminal, the parser will invoke a sub-recognizer for that non-terminal.
For example, consider the grammar and state shown in figure 4.7. Two of
the target-graph edges in the state of the S-recognizer are inputs to the
non-terminal B, so the parser calls a recognizer for B, giving it the initial
state shown in figure 4.8.

The initial state of the B recognizer has followed by "transitivity" from
the port-correspondence information in the grammar rule for B. In general,
suppose recognizer state s contains target edges e'_i (target images of edges
e_i) which are inputs to a non-terminal node n'. The parser deletes any edge
pairs from s which involve the e'_i, chooses a production P which derives the
type of n', and invokes a recognizer for the right-hand side of P. The initial
state s' of this recognizer will contain one pair for each edge e'_i, as follows:
Suppose e'_i enters n' at port p_i, and suppose port p_i is mapped by P to port
p'_i on node n (in P's right-hand side). Then s' will pair the leading (target)
edge entering n at p'_i with e_i (the input image of e'_i). The reader is
encouraged to verify that this procedure produces the state of figure 4.8.

Figure 4.9. An accepting state for the recognizer invoked in figure 4.8. This rule
should be reduced.

Figure 4.10. The result of the reduction invoked in figure 4.9.
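
The invocation step can be sketched in Python under a simplifying assumption:
the production's port mapping is collapsed into a single dictionary from each
input port of the non-terminal to the corresponding leading edge of the rule's
right-hand side. States are dictionaries from input edges to target edges, as
an illustrative encoding; the name `invoke` is hypothetical.

```python
def invoke(state, nonterminal_ins, rule_leading_edges):
    """Split a caller's state when a sub-recognizer is invoked.

    state:              dict mapping input edge -> target edge (caller's pairs)
    nonterminal_ins:    dict mapping port -> target edge entering the
                        non-terminal node n' in the caller's target graph
    rule_leading_edges: dict mapping port of the non-terminal -> leading edge
                        of the chosen rule's right-hand side (assumed form of
                        the production's port mapping)
    Returns (caller state after deletion, initial state of the sub-recognizer).
    """
    by_target = {t: e for e, t in state.items()}   # target edge -> input image
    caller_state = dict(state)
    sub_state = {}
    for port, t_edge in nonterminal_ins.items():
        if t_edge in by_target:                    # this input is in the state
            e = by_target[t_edge]
            # Pair the rule's leading edge for this port with the input image.
            sub_state[e] = rule_leading_edges[port]
            # Delete the pair from the caller's state.
            del caller_state[e]
    return caller_state, sub_state
```

Only inputs of the non-terminal that are already paired in the caller's state
are transferred, so the same sketch covers the case where some inputs have not
yet been reached.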

The operation dual to invocation of a sub-recognizer is the return of that
sub-recognizer. In the example given above, the state of the B-recognizer
after the parser reads node n will be that given in figure 4.9. The edge set of
this state contains a trailing edge, so the parser will reduce the recognized
rule and move the calling S-recognizer into the state shown in figure 4.10. In
general, whenever a state contains a target-graph trailing edge, the parser
will perform a reduction, by adding edge pairs to the caller's state in a
procedure which reverses that used at invocation time.

The reader must by now be expecting the following claim: this linkage
mechanism is a general phrasing of the exact mechanism used by the string
parser. It simply makes explicit the manipulations of the target-graph/
input-graph edge correspondence that were left implicit in the string case.

4.1.6. Flow Graph Parsers

We have now introduced the three main components of our parser
construction technique: a mechanism for reading the input, the recognizers for
individual rules, and the linkage mechanism used to interconnect recognizers
for different rules. Rather than state a complete construction algorithm
(as we did in the previous chapter), we will instead look at two examples of
grammar/graph pairs and the behavior of the parser constructed for them.
These examples will expose some details of the construction and behavior
of the resulting parsers that have not been considered thus far; in addition,
they will introduce a representation that forms the basis for that used by
our simulation algorithm.

A Simple Example

Let us start by considering the behavior of a parser constructed for the
following simple grammar:



[Figure: grammar rules for S, A, and B]

when run on the following graph:

This parser starts off by calling the recognizer for the right-hand side of
the rule deriving S. The parser stack is empty, and its read head is over
the leading edge of the input. Since there is only one such edge, the parser
has no choice in determining the S-recognizer's initial state: that is, there
is only one possible correspondence between the leading edges of the input
and those of the S-recognizer's target graph. (We consider below how to
make this choice in general.)

The initial configuration of the parser is as follows:







At this point, only one node in the input is readable. As the parser's read
head reads and moves over it, its type and linkage information is used to
make a state transition in the S-recognizer. Of course, if the state-transition
algorithm determined the input to be unacceptable, the parser would stop
and reject the input. In this case, however, the parser moves into the
following configuration:




This state contains target edges which are inputs to the non-terminals A
and B. The parser thus invokes sub-recognizers for these nodes, moving
into the following configuration:



The following points are worth noting:

• We are no longer using a simple stack to keep track of sub-recognizer calls.
Because multiple calls may be made from a single state, we use a
"tree-shaped stack" that keeps track, for each call made, of both the calling
state and the particular node being recognized in the caller's target graph.

• This behavior appears different from that of the string recognizer, which
left an "Earley dot" in front of the node being derived. In fact, this dot
served to identify the node being derived (a function now handled by
information kept on the stack), not as a state marker.



The parser is now ready to read another node. Let us say it reads a; this
leaves it in the following configuration:



The recognizer for A has now moved into an accepting state, so the parser
reduces the production involved by moving into this configuration:








Note that the S-recognizer has changed state while its call to the B-recognizer
is outstanding. This could never happen in the string parser. To emphasize
this mutability of the state information stored in a graph parser's stack,
think of the stored information as state objects rather than states.






Next, the b-node is read, and the B-recognizer changes state accordingly:








The B-recognizer has now moved into an accepting state, so the parser
reduces its rule and moves into the following configuration:





Notice that the S-recognizer state-pairs derived from the reduction
procedure are added to those of its prior state. This additivity, together with the
tree-shaped stack, allows multiple simultaneous calls to sub-recognizers.

Finally, the parser reads the m-node and moves into the following
configuration:







The parser's input has been completely read, and its trailing edges are in
correspondence with all the trailing edges of the S-recognizer's target graph.
The parser accepts.

A Complex Example

The following example demonstrates the behavior of a parser whose
grammar and read-head motion combine to produce "staggered" invocations and
reductions. Consider the following grammar:


[Figure: grammar rules for S and A]

which derives the following graph:




Unlike the grammar considered in the last example, the start symbol for
this grammar has two inputs, so the parser constructed from it must make
some determination as to which of the input graph's inputs correspond to
which of the start symbol's inputs. In general, there is no way (short of
trying each possibility) that a parser can determine which choice of
correspondences, if any, allows a parse. Thus, in our description of the parser
(and in our simulation algorithm), we will assume that the input itself
contains a specification of one such correspondence, and the constructed parser
will use that one.



'Htnrf tbf wnniktum dpuritban tal*a Hath fit 
rack n ipKiE^Atikdl an: rcTddj kt hand. 



'Hriritar and pmpn i* nir'it. tlie Lctjiii- Era 



Let us suppose, then, that the parser for the above grammar starts in the
following configuration on the above graph:





The S-recognizer's state contains an input edge of the non-terminal A, so
the parser activates a recognizer for A and moves into the following
configuration:







The following points are worth noting:

• The parser has started the A-recognizer before it can determine an
input-edge correspondence for all of A's inputs. When node n is read, and the
parser determines a correspondence for A's other input, the new input
will be added to the recognizer's (then-current) state. This process is
called staggered invocation.

• Only those pairs involving input edges of A have been deleted from S's
state. This "subtractivity" is dual to the additivity of the reduction
process.

• Configurations which involve partial states, such as this one, will always
result from situations in which the head image of a recognizer's state
contains some but not all of a non-terminal's input edges. In these situations,
the parser will invoke a sub-recognizer for the non-terminal involved even
if the input edges corresponding to the non-terminal's inputs are
inputs to a node that is not yet in the right fringe of the parser's read head
position. For example, in this configuration:



a parser would invoke a recognizer for B even though the node b is not
yet eligible for reading.

The parser is now ready to read a node; let us say it reads the l node. It
would move into the following configuration:





in which only the n node is readable, giving:










The other input to the previously-invoked A-recognizer has arrived, so the
parser adds it to that recognizer's state:










A careful reader may have noticed that we have drawn a two-way arrow
between the A-recognizer and the node it was called for. This is to remind us
that the parser, in order to do this staggered invocation, must keep track not
only of which node a recognizer was called for, but also any recognizer that
has already been called for a given node. That is, in addition to keeping
"return pointers" with active recognizers, the parser must keep "call pointers"
with non-terminal nodes that have sub-recognizers active for them.

The parser now reads the m node, leading to the following configuration:







This situation is the converse of one encountered earlier; instead of one
(but not all) of the A-recognizer's inputs having been reached, one (but not
all) of its outputs have. The parser performs a staggered reduction similar
to the staggered invocation it performed earlier, and reaches the following
configuration:



The parser is now ready to read either the n- or the l-node. Let us say it
chooses n; this leads to the following configuration:





Finally, the parser reads the l-node and moves into this configuration:







This configuration allows the completion of the staggered reduction started
earlier, so the parser terminates the A recognizer and adds to the state of
the S-recognizer:






This is an accepting configuration.



4.2. The Parsing Algorithm

We are now ready to present our flow graph parsing algorithm. The
algorithm simulates the behavior of a non-deterministic graph parser, exploring
simultaneously all the parser's reachable configurations. As with the string
algorithm of the last chapter, it will be useful to think of the algorithm as
simultaneously simulating a large number of graph parsers, each of which
eventually makes different guesses as to the derivation tree of the input.

4.2.1. Preliminaries

We introduce here an item-based notation for the configurations of a graph
parser. We will simulate a given graph parser by constructing item lists,
similar to those of the last chapter, which show the parser's configuration
at each step of the input.

The basic unit of this notation is a representation of a state object called
a state item (or just item). Items are composed of three parts: a state of a
graph recognizer, a list of pending calls to other items, and a list of items
to return to. In addition, items will sometimes be annotated as dead, and
they will sometimes be marked with an R-flag. (We say more about these
annotations below.) As before, we represent the pointers in call lists and
return sets with integers, and we subscript items with integers, yielding a
representation like this:

    [<state>, <call list>, <return pointer>]_n
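The item notation can be mirrored in a small data structure. The following Python sketch is illustrative only: the field names, the choice to model a state as a set of (target edge, input edge) pairs, and the `show` helper are assumptions made for this example and are not taken from the report's LISP implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One state item, written [<state>, <call list>, <return pointer>]_n."""
    index: int                                   # integer subscript of the item
    state: frozenset = frozenset()               # (target edge, input edge) pairs
    calls: dict = field(default_factory=dict)    # non-terminal node -> callee indices
    returns: set = field(default_factory=set)    # indices of items to return to
    dead: bool = False                           # annotation: recognizer has rejected
    r_flag: bool = False                         # annotation: the R-flag

    def show(self) -> str:
        """Render the item in the bracketed notation, with sorted contents."""
        pairs = ", ".join(f"{t}={e}" for t, e in sorted(self.state))
        calls = ", ".join(f"{n}:{sorted(c)}" for n, c in sorted(self.calls.items()))
        rets = ", ".join(str(r) for r in sorted(self.returns))
        return f"[{{{pairs}}}, ({calls}), ({rets})]_{self.index}"


# A hypothetical item 4 holding one edge pair, two pending calls for a node
# named "B1", and a return pointer to item 1.
item = Item(index=4, state=frozenset({("t1", "e3")}),
            calls={"B1": {5, 6}}, returns={1})
print(item.show())  # [{t1=e3}, (B1:[5, 6]), (1)]_4
```

Keying the call list by non-terminal node is a design choice of this sketch; it makes it easy to ask which callees are pending for a particular node.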

Using this representation, we can represent the configuration of a graph
parser by showing its read head position in the input and items for the
state objects of all its recognizers. The correspondence part of each state
will be indicated by labeling the target edges with the input edges involved.
For example, in the first sample run shown above, the configuration
obtained just after the invocation of the A- and B-recognizers can be given
as follows:

[Figure: parser configuration with item list]









(The subscripts used here are, of course, arbitrary.)

We say that the parser's active recognizers (or active items) are those
whose states are non-empty. In the sample above, only the A and B
recognizers are active; the S recognizer is said to be suspended.

When we wish to show the entire run of a graph parser on a given input,
we can compress the space used by showing, after each read operation, only
those items which are active or which made a state transition. For example,
the run of a graph parser on the grammar and input graph shown above is shown
in figure 4.11.

While this depiction of a parse run looks a lot like the parse lists of
Earley's algorithm, there are some important differences. First, not all active
items in a given list change state when the next node is read. Second, more
than one active item in a given list can represent recognizers invoked by a
single parser. In Earley's algorithm, each active item represented the sole
recognizer currently active for at least one parser.

[Figure 4.11. Run of the graph parser on the first sample.]

4.2.2. Optimizations

As we did with Earley's algorithm, we will coalesce the items representing
recognizers for the same rule which were started at the same input position
and are in the same state; one item will be used to represent the state object
of all such recognizers. In order to accomplish this, we will be using the
two optimizations, multiple-call collapsing and duplicate-item merging,
that were introduced in the last chapter.

Because the linkage mechanisms of graph parsers are more complex than
those of string parsers, we must be more careful in applying optimizations.
In particular, the presence of staggered invocations and reductions, and the
fact that calling items can change state while their callees are still active,
lead to cases that require a very good understanding of the intent of the
optimizations. Thus, rather than proceed directly with the statement of our
parsing algorithm, we will first consider some examples of its behavior.

4.2.3. Examples

In this section we run the parsing algorithm on five different grammar/graph
pairs. Each sample run exposes a different facet of the algorithm's behavior;
between them these examples cover all the cases that the algorithm handles
specially. Having once understood all five of these examples, readers should
have little difficulty understanding the complete statement of the algorithm
which follows them.

For each example, we show all of the item lists constructed by the
algorithm. Each such list is followed by some explanation of how it was
constructed.

[Figure: Example 1, list 0]

Example 1.

This example illustrates the effects of state changes in the caller while a
callee is still active. It introduces the c-split operation, an operation used
at completion time which splits a single calling item into two.

The preceding figure shows a grammar, the input graph derived by that
grammar, and the item list constructed by the simulation before any nodes
have been read. No sub-recognizers have been predicted at this point, so only
one item is necessary.

[Figure: Example 1, list 1]

Here is the list after reading the first node. Item 1 was active on this node;
that is, its state contained inputs of that node. The node was acceptable (as
per section 4.1.4), and the resulting transition has led to a state containing
inputs to the non-terminals B and A. (The [] symbols around edges in item 1
indicate that these edges were present in that item's state immediately after
the input node was scanned, but were then deleted as a result of the
sub-recognizer call mechanism.)

Only one derivation decision is possible for the A-node, but two are
possible for the B-node. Accordingly, item 1 represents the S-recognizer for
two parsers. Each has made a different derivation decision for B, but both
remain in the same state and were started at the same input position. Thus,
we represent both with a single item and use the call list for B in that item
to keep track of both outstanding calls.

[Figure: Example 1, list 2]

Now we read the next node. Item 4 was the only one active on this node, so
it makes the appropriate transition. Its new state necessitates predicting
the two B-nodes by invoking appropriate sub-recognizers; this happens in a
manner exactly like that seen in the last list. Items 5, 6, 7, and 8 are the
result; notice that although the recognizers for the upper and lower B-nodes
are being started on the same list, they are not started at the same input
position for the purposes of multiple-call collapsing.

Items 2 and 3 are on this list because, while they were not active on
the node read, they are active on a node eligible to be read. This copying
forward of active items ensures that, each time a node is read, all the items
active on that node will be present on the previous list. Thus, we never have
to search back through prior lists for active items.

[Figure: Example 1, list 3]

Now the fun begins. We read the next node and item 5 moves into an accepting
state. We can now complete node B_1 of item 4 by letting item 5 return.

But recall that item 4 represents two recognizers: one which predicted B_1
via item 5 and one which predicted B_1 via item 6. If we make the state
transition in item 4 called for by the return of item 5, we will have made a
spurious state transition in the recognizer which called item 6.

The solution is to c-split item 4 into item 9. This process separates into
two items the representation of the two recognizers which were merged in
item 4. It works by:

1. Copying item 4 to the new item 9. (This gives us two items each of which
   represents both of the recognizers merged in item 4.)

2. Removing the call to item 5 from item 9, removing the call to item 6
   from item 4, and removing the return to item 4 by item 6. (This makes
   each of item 4 and item 9 represent only one of the two recognizers that
   were merged in item 4.)

3. Going through all the callees of item 9 and informing them about their
   new caller. (This keeps the call list of item 9 and the return sets of its
   callees consistent.)

4. Going through all the callers of item 9 and informing them of their new
   callee. (This keeps the return set of item 9 and the call lists of its callers
   consistent.)

The effects of the c-split operation can be seen in items 1, 4, 6, 7, 8, and 9
of the current list; once the c-split is complete, item 5 returns to item 4 as
described in section 4.1.5.

Note that items 2 and 3, which have been brought forward to this list
because they are active, are unaffected by the c-split. In general, none of
the 'siblings' of a c-split item are affected by that split; only its 'parents'
(callers) and 'children' (callees) are affected.

[Figure: Example 1, list 4]

Reading the next node puts item 7 in an accepting state. This c-splits item 4
into item 10 and item 9 into item 11, for the same reasons that item 5 c-split
item 4 in the last list; only this time it is the other B-node that has been
completed.

[Figure: Example 1, list 5]

Now we read the next node. Items 2 and 3 change state, and item 2's return
c-splits item 1 into item 12. Items 4, 9, 10, and 11 are affected by this split.

[Figure: Example 1, list 6]

We read the next node, and the elimination of spurious parses begins. Items 4
and 10 are active on this node but it is not acceptable, so their recognizers
reject. We indicate this by marking the items as dead, a step which was not
necessary in the string case. The reason is that, although the recognizers
represented by these items have rejected, they may have outstanding calls
to other recognizers. If so, when these other recognizers return to
their dead callers, their returns might cause state transitions which lead to
spurious parses. By marking the items for rejecting recognizers as dead,
we will know to suppress future returns to those items.
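The suppression of returns to dead items can be sketched in a few lines of Python. This is a minimal illustration, not the report's implementation: the `Item` fields and the helper name `live_return_targets` are assumptions, and the item numbers below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    index: int
    returns: set = field(default_factory=set)   # indices of calling items
    dead: bool = False                          # set when the recognizer rejects


def live_return_targets(callee, items):
    """Callers that should actually receive the callee's return.

    The callee's return set is left untouched; returns aimed at dead
    callers are simply skipped when the return takes place.
    """
    return [items[r] for r in sorted(callee.returns) if not items[r].dead]


# Hypothetical situation: item 8 would return to items 10 and 11,
# but item 10 has been marked dead.
items = {
    8: Item(8, returns={10, 11}),
    10: Item(10, dead=True),
    11: Item(11),
}
targets = live_return_targets(items[8], items)
print([t.index for t in targets])  # [11]
```

Filtering at return time, rather than eagerly deleting pointers to dead items, keeps the bookkeeping local to the item that died.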

Reading this node also puts item 6 into an accepting state, but its return to
items 9 and 11 does not cause them to c-split because they have no other
recognizers pending for the B-node that item 6 completed.

Items 1, 3, and 8 are copied unchanged from the previous list because
they are active. Note that item 8, although it contains item 10 in its return
list, is unchanged by the fact that item 10 has died. If item 8 were ever to
move into an accepting state, its return to item 10 would simply be ignored.

[Figure: Example 1, list 7]

We read node c_1. Items 8 and 11 die. Item 9 moves into an accepting
state, causing items 1 and 12 to c-split into items 13 and 14. This split
takes place even though none of the pending calls for the A-node, other than
the call to item 9, are to live items: the decision whether or not to c-split is
made solely on the basis of the number of pending calls. On the other hand,
because items 4, 10, and 11 are dead, the updating of their return sets
normally done by the c-splits of items 1 and 12 is suppressed.

We read the next node. Item 1, which is a top-level item, moves into an
accepting state that includes all of its trailing edges: the input is accepted.

[Figure: Example 2, list 0]

Example 2

This example explores the interaction of multiple-call collapsing and
staggered invocation. It introduces the p-split operation, in which an item is
split into two items as the result of a prediction operation.

We start off with just two items, one for each derivation of S.

[Figure: Example 2, list 1]

Upon reading the first node, both S-recognizers make transitions into a state
containing an input of A. Since the two recognizers agree as to which input
edge is input to which part of A, multiple-call collapsing takes place and the
A-recognizers invoked will each return to both S-recognizers.

[Figure: Example 2, list 2]

We read the next node. Both A-recognizers make appropriate transitions.
The items for the S-recognizers are active and unchanged.

[Figure: Example 2, list 3]

When we read the next node, the S-recognizer of item 1 moves into a state
containing the other input of A. This recognizer must pass this new input
down to the A-recognizers of items 3 and 4, but these items also represent
recognizers invoked by item 2, which does not want to pass down this second
input.

This situation is complementary to that in which we c-split a caller,
and the solution is also complementary: we p-split the callee. By this we
mean we split the representations of the two recognizers merged in item 3
among two items, and we do the same with item 4. Each of these p-splits is
accomplished similarly; for item 3 we do it by:

1. Copying item 3 to item 5. (This gives us two items, each representing a
   recognizer invoked by two parsers.)

2. Removing item 3's return to item 2, item 5's return to item 1, and item 2's
   call of item 3. (This makes each of the items represent a recognizer
   invoked by just one of the two parsers.)

In the general case, items 3 and 4 might have had outstanding calls, in
which case we would have made their callees return to both them and their
p-splits.

The result of the p-split is that we can now pass down the new input
from item 1 to items 3 and 4 without hurting the recognizers invoked by
item 2. Thus, we are ready to read the next node.

[Figure: Example 2, list 4]



We read the next node. Item 3 [...] and item 4 makes a normal state
transition. Item 2 moves into a state containing the A-node's other input;
this input can be passed down to items 5 and 6 without p-splitting them
because they have no other callers.

[Figure: Example 2, list 5]

[...] into accepting states. They then return to items 1 and 2, each of which
[...]

[Figure: Example 3, list 0]

Example 3.

This example demonstrates staggered reduction. No new operations are
introduced.

The first list starts with items for two alternative S-recognizers. Both
of these recognizers expect an A at the same point in the input, so two
A-recognizers are invoked and each will return to both S-recognizers.

[Figure: Example 3, list 1]



The first node is read, leading item 4 into an accepting state for one of its
output edges. Its return causes items 1 and 2 to c-split into items 5 and 6.
Notice that, although item 4 has returned one of its outputs, it remains in the
call lists of its callers. It will be removed only when it returns all its outputs.

[Figure: Example 3, list 2]

The next node is read, leading item 3 into an accepting state for one of its
output edges. Items 5 and 6 need not be c-split since they have no other calls
pending for the A-node.

[Figure: Example 3, list 3]

The next node is read, killing item 1, which was expecting a c. Items 2, 5,
and 6 all make state transitions; items 3 and 4 are unchanged.

[Figure: Example 3, list 4]

The next node is read, leading both items 3 and 4 into accepting states. Their
returns are uneventful.

[Figure: Example 3, list 5]

The next node is read. The results are straightforward.

[Figure: Example 4, list 0]

Example 4

This example explores an interaction between staggered reduction and
duplicate-item merging which the reader may have anticipated after seeing the
previous example. No new operations are introduced.

We start by predicting the A-node of the S-recognizer in all possible
ways.

[Figure: Example 4, list 1]

Upon reading the first node, both the A-recognizers are ready to accept.
The return of item 2 causes item 1 to c-split into item 4, but then the return
of item 3 moves item 4 into the same state as item 1, so it is merged into
that item.

A point worth noting here is that only items which were originally c-splits
of each other can ever be merged. This is because c-splitting is the only way
to get two items which represent the same recognizer started at the same
input position.

[Figure: Example 4, list 2]

This causes item 1 to be c-split once again, this time into item 5.

[Figure: Example 4, list 3]



The next node is read. The results are straightforward.

[Figure: Example 5, list 0]

Example 5

Our last example is our most complex one. It demonstrates left recursion,
and introduces the r-split operation. Staggered invocations and reductions
are thrown in for good measure.

Things start out simply enough, as we invoke a recognizer for S.

[Figure: Example 5, list 1]

Reading the first node moves the S-recognizer into a state where it must
predict one of the A-node's inputs. Multiple-call collapsing happens as it did
in the string case, with both items 1 and 3 calling item 2 for the non-recursive
expansion of A. Unlike the string case, however, item 3 does not explicitly
call itself recursively; instead, it is marked with the R-flag.

Here are the intuitions behind this marker: if we indicate by S an
S-recognizer, by A_n a non-recursive A-recognizer, and by A_r a recursive
A-recognizer, then the simulation is currently representing an infinite number
of parsers, each with one of the following call structures:

    S -> A_n
    S -> A_r -> A_n
    S -> A_r -> A_r -> A_n
    ...

The simulation does this by keeping just the two structures

    S -> A_n
    S -> A_r -> A_n

(and it shares the S and A_n between the two). The R-flag on item 3
indicates that it is the recursive A-recognizer which is being used to encode
the infinite sequence of such recognizers present in the non-optimized case.



[Figure: Example 5, list 2]

The next node is read, causing the non-recursive A-recognizer to make a state
transition.

[Figure: Example 5, list 3]

This is a tricky one. The next node is read, and item 1 must pass down the
other input of its A-node to items 2 and 3. This first p-splits item 2 into
item 4, because item 2 is also called by item 3 and the decision to p-split is
made solely on the basis of the number of callers, not on whether they agree as
to the inputs passed. [...] Such an action would make item 4 and item 2
calls on the same recognizer at the same point, so multiple-call collapsing
acts by redirecting item 3's call to item 2 instead of item 4. This in turn
removes the last return pointer from item 4, so it is marked as dead. The
net effect is to (correctly) keep item 1 and item 3 both calling item 2 for the
non-recursive expansion of their A-nodes.

[Figure: Example 5, list 4]

Reading node c_1 moves the non-recursive recognizer for A (item 2) into
an accepting state. This causes item 1 to c-split as usual into item 5. Item 3,
however, does an r-split into item 6, which means it does a normal c-split
but then loses its R-flag and picks up its split image as a caller.

Once again, let us look at the intuition underlying this action. If we
flag a recognizer that has returned with an asterisk, and a recognizer that
has made a state transition as the result of a return with a prime, then the
parsers being simulated now have one of the following call structures:

    S' -> A_n*
    S -> A_r' -> A_n*
    S -> A_r -> A_r' -> A_n*
    ...

The simulator now represents these structures using the following three
structures

    S' -> A_n*
    S -> A_r' -> A_n*
    S -> A_r -> A_r' -> A_n*

where item 1 is S', item 2 is A_n*, item 5 is S, item 3 is A_r', and item 6
(the r-split of item 3) is the A_r recognizer flagged as the representative of
the infinite chain. A simple c-split of item 3 would have been insufficient to
build this new structure because it would have made items 3 and 6 siblings,
instead of child and parent. In addition, it would not have correctly marked
item 6 as the A_r recognizer instead of item 3.

[Figure: Example 5, list 5]

After reading the next c-node, the whole r-split process happens again.
The simulation is now keeping the following structure:

[Figure: call structure]

The reader should be sure he understands how this structure was attained!

[Figure: Example 5, list 6]

With the reading of node d_1, the bottom finally begins to come off the
stack, leaving a structure

[Figure: call structure]

which differs from that of the last list in that all the non-recursive
A-recognizers have gone away. The reader should consider what the structure
would have been if we had read another c node and started another level of
recursion.

[Figure: Example 5, list 7]

The A_r recognizer on the bottom of the stack makes a transition on the next
b node.

[Figure: Example 5, list 8]

The stack unwinds another level. The final configuration kept by the
simulator is:

    S'
    S -> A_r' (item 7 -> item 6)
    S -> A_r -> A_r' (item 7 -> item 8 -> item 6)

4.2.4. Algorithm Description

We are now ready to state our flow graph parsing algorithm. The algorithm
takes as input a flow grammar and a flow graph; it determines whether
the graph is generated by the grammar. As with our version of Earley's
algorithm, the output of the algorithm is a sequence of item lists, one for
each node in the input, which represent all the possible configurations of
a non-deterministic graph parser when run on that input. The algorithm
does not output a parse tree for the input, although it can be modified to
do so in a manner similar to that presented in the last chapter.

The algorithm operates by using a list of items I_i to keep track of all
the configurations a parser might be in after reading the first i nodes it
chooses to read. Given lists I_0, ..., I_{j-1}, the algorithm constructs list I_j
by using three operations: a scanner operation, a predictor operation, and
a completer operation. These operations in turn use three sub-operations:
the p-split, the c-split, and the r-split. We first describe the nature of all
these operations, and then show how the algorithm uses them to construct the
lists I_0, I_1, ..., I_n.

The P-Split Operation 

The p-split operation takes as input an item i, a non-terminal node n which
i is deriving, an item c which has called i, and a list I_j. It performs the
following actions:

1. It creates a new item i' whose state part is that of i, whose call list is
   that of i, and whose return set is that of i except that c is removed.

2. It adds i' to I_j.

3. It goes through the live callees of i and adds i' to each of their return
   sets.

4. It goes through all the live callers of i except c and replaces their calls
   to i by calls to i'.

5. It changes i's return set to be the singleton {c}.
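The five p-split steps can be sketched as follows. This is a hedged illustration rather than the report's LISP code: the `Item` fields, the `items` dictionary mapping integer subscripts to items, and the `new_index` parameter are assumptions of the sketch, and the node argument n is omitted because this simplified representation does not need it.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    index: int
    state: frozenset = frozenset()
    calls: dict = field(default_factory=dict)    # node -> set of callee indices
    returns: set = field(default_factory=set)    # indices of calling items
    dead: bool = False


def p_split(items, item_list, i, c, new_index):
    """Split item i so that caller c keeps i and every other caller gets a copy."""
    old = items[i]
    # Step 1: same state and call list; return set is that of i minus c.
    new = Item(new_index, old.state,
               {node: set(callees) for node, callees in old.calls.items()},
               old.returns - {c})
    items[new_index] = new
    # Step 2: add the copy to the current list.
    item_list.append(new_index)
    # Step 3: live callees of i must also return to the copy.
    for callees in old.calls.values():
        for k in callees:
            if not items[k].dead:
                items[k].returns.add(new_index)
    # Step 4: live callers other than c now call the copy instead of i.
    for r in old.returns - {c}:
        if items[r].dead:
            continue
        for callees in items[r].calls.values():
            if i in callees:
                callees.discard(i)
                callees.add(new_index)
    # Step 5: i now returns only to c.
    old.returns = {c}
    return new
```

Applied to a configuration shaped like the one in Example 2 (items 1 and 2 both calling items 3 and 4), `p_split(items, lst, 3, 1, 5)` leaves item 3 returning only to item 1, makes item 5 return only to item 2, and redirects item 2's call from item 3 to item 5.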

Items i and i' are said to be p_j-splits of each other. This relation is
transitive (any other p_j-splits of i are p_j-splits of i', and vice versa) but
does not persist across lists: a p_j-split of i is not a p_k-split of i for k ≠ j.

The C-Split Operation

The c-split operation takes as input an item i, a non-terminal node n in
i's target graph, an item c which i has called to derive n, and a list I_j. It
performs the following actions:

1. It creates a new item i' whose state part is that of i, whose return list is
   that of i, and whose call list is that of i except that c is removed from the
   callees for n.

2. It adds i' to I_j.

3. It goes through the live callers of i and adds i' to any call list on which
   i appears.

4. It goes through all the live callees that i has pending for nodes other
   than n and adds i' to each of their return sets.

5. It goes through all the live callees other than c that i has pending for n
   and replaces i by i' in their return sets.

6. It makes c the only callee for n in i.

The items i and i' are said to be c-splits of each other. This relation is
transitive (any other c-splits of i are c-splits of i', and vice versa) and it
persists across lists: a c-split of i created on a list I_j is a c-split of all
those created on other lists.

The R-Split Operation

The r-split operation takes the same inputs i, n, c, I_j as the c-split. It first
performs a c-split operation to produce an item i' on list I_j, and then takes
the following actions:

1. It marks i' with the R-flag.

2. It removes the R-flag from i.

3. It adds i to the call list of n in i', and adds i' to the return set of i.
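Since the r-split is defined on top of the c-split, the two can be sketched together. The Python below is an illustration under stated assumptions, not the report's implementation: the `Item` fields, the `items` dictionary, the `new_index` parameter, and the decision to give the copy default flags are all choices made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    index: int
    state: frozenset = frozenset()
    calls: dict = field(default_factory=dict)    # node -> set of callee indices
    returns: set = field(default_factory=set)    # indices of calling items
    dead: bool = False
    r_flag: bool = False


def c_split(items, item_list, i, n, c, new_index):
    """Split caller i so that i keeps only callee c for node n."""
    old = items[i]
    # Step 1: copy state and return list; drop c from the callees for n.
    calls = {node: set(callees) for node, callees in old.calls.items()}
    calls[n].discard(c)
    new = Item(new_index, old.state, calls, set(old.returns))
    items[new_index] = new
    item_list.append(new_index)                       # Step 2
    for r in old.returns:                             # Step 3
        if not items[r].dead:
            for callees in items[r].calls.values():
                if i in callees:
                    callees.add(new_index)
    for node, callees in old.calls.items():
        for k in callees:
            if items[k].dead:
                continue
            if node != n:
                items[k].returns.add(new_index)       # Step 4
            elif k != c:
                items[k].returns.discard(i)           # Step 5
                items[k].returns.add(new_index)
    old.calls[n] = {c}                                # Step 6
    return new


def r_split(items, item_list, i, n, c, new_index):
    """c-split i, then move the R-flag to the copy and make the copy call i."""
    new = c_split(items, item_list, i, n, c, new_index)
    new.r_flag = True                                 # Step 1
    items[i].r_flag = False                           # Step 2
    new.calls[n].add(i)                               # Step 3: i' calls i ...
    items[i].returns.add(new_index)                   # ... and i returns to i'
    return new
```

On a configuration shaped like Example 1 (item 4 calling items 5 and 6 for one B-node and items 7 and 8 for the other), `c_split(items, lst, 4, "B_upper", 5, 9)` leaves item 4 calling only item 5 for that node, gives item 9 the call to item 6, and makes items 7 and 8 return to both 4 and 9. The node name `"B_upper"` is a label invented for the sketch.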



The Scanner Operation 

The scanner operation takes as input an item i from list I_{j-1} and the j-th
input node to be read, n_j. Let s be the state part of i. If s is empty (i is
suspended), the scanner does nothing. If s is non-empty, but none of its pairs
contain edges which are inputs of n_j, then the scanner adds i unchanged to
list I_j. If s does contain inputs of n_j, but n_j is determined to be
unacceptable as described in section 4.1.4, the scanner marks i as dead and
adds it to list I_j. Finally, if n_j is acceptable to s, the scanner computes a
new state s' as described in section 4.1.4, changes the state part of i to s',
and adds i to list I_j.
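The four scanner cases can be sketched as a single function. This is an illustration only: the acceptability test and state computation of section 4.1.4 are not reproduced here, so they are stood in for by a caller-supplied `transition` callback (a hypothetical name) that returns the new state, or `None` when the node is unacceptable. States are assumed to be sets of (target edge, input edge) pairs.

```python
from dataclasses import dataclass


@dataclass
class Item:
    index: int
    state: frozenset = frozenset()   # (target edge, input edge) pairs
    dead: bool = False


def scan(item, node_inputs, transition, next_list):
    """Apply the scanner to one item for a node whose input edges are node_inputs."""
    s = item.state
    if not s:
        # Suspended item: the scanner does nothing.
        return
    if not any(edge in node_inputs for _, edge in s):
        # Active, but not on this node: copy forward unchanged.
        next_list.append(item)
        return
    new_state = transition(s)
    if new_state is None:
        # Active on the node, but the node is unacceptable: mark dead.
        item.dead = True
    else:
        # Acceptable: move the item to its new state.
        item.state = new_state
    next_list.append(item)
```

Passing the transition rule in as a callback keeps the sketch independent of how recognizer states are actually advanced.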

The Predictor Operation

The predictor operation takes as input an item i from list I_j. Let s be the
state part of i. If none of the target edges of s are inputs to non-terminals,
the predictor does nothing. Otherwise, for each non-terminal node n which
has one or more input edges in s, the predictor distinguishes two cases:

Case (i) The item i already has calls pending for n.

Let i_1, ..., i_m be the live items that i called for n. Then for each i_k, the predictor does the following:

1. If i_k has callers besides i, the predictor invokes the p-split operation on i_k, n, i, and I_j.

2. Let s_k be the state gotten by augmenting i_k's current state in accordance with the calling conventions of section 4.1.5. If the predictor has already added a p-split i' of i_k with state part s_k to I_j, then the predictor adds i to the call list of i' and replaces i_k by i' on the call list for n in i. Otherwise, the predictor changes the state of i_k to be s_k and adds it to list I_j.

(N.B. The item i' referred to above may not still be in state s_k when its return list is modified, since predictions occurring between this one and the one that added i' may have modified the state of i'. In such a case, the predictor still reuses i' rather than create a new item.)

Case (ii) The item i has no calls pending for n.

Let N be the type of n, let R_1, R_2, ..., R_m be all the recognizers for grammar rules which derive N, and let s_1, s_2, ..., s_m be the initial states of those recognizers computed as per section 4.1.5. For each s_k, if the predictor has already added a new item i' with state part s_k to list I_j, the predictor adds i' to the call list for n in i and adds i to the return set of i'. Otherwise, the predictor creates an item with state part s_k, empty call list, and return set {i}; if this item is for a left-recursive rule, the predictor marks it with the R-flag. The predictor then adds this item to I_j and to the call list for n in i.

(N.B. The point made above about possible changes in the state of the items i' also applies in this case.)
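Case (ii) amounts to a create-or-reuse loop over the initial states. The sketch below is an illustration only: the `Item` fields, the `added` dictionary (recording which states the predictor has already put on the current list), and the pairing of each initial state with a left-recursion flag are assumptions made for the example, and states are assumed to be hashable.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)          # identity-based equality, so items can sit in sets
class Item:
    state: object
    calls: dict = field(default_factory=dict)    # node -> list of called items
    returns: set = field(default_factory=set)    # items to return to
    r_flag: bool = False


def predict_fresh(i, n, initial_states, item_list, added):
    """Case (ii): item i has no calls pending for node n.

    initial_states: list of (state, is_left_recursive) pairs, one per rule
    added: dict from state to the item already predicted on this list
    """
    for state, left_recursive in initial_states:
        callee = added.get(state)
        if callee is not None:
            callee.returns.add(i)                # reuse the existing item
        else:
            callee = Item(state=state, returns={i}, r_flag=left_recursive)
            added[state] = callee
            item_list.append(callee)             # new item joins the list
        i.calls.setdefault(n, []).append(callee)
```

A second caller that predicts the same initial state on the same list reuses the first caller's item, which then holds both callers in its return set.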



The Completer

The completer operation takes as input an item i on list I_j. Let s be the state part of i. If none of the target edges in s are trailing edges, the completer operation does nothing. Otherwise, let i_1, ..., i_m be the live members of i's return set. Let n_k be the non-terminal in i_k whose call list contains i, and let c_k be that call list. Finally, let s_k be the state of i_k induced by i's return as per section 4.1.5. The completer performs the following actions for each i_k:

1. If i_k is marked with the R-flag, the completer invokes the r-split operation on i_k, n_k, i, and I_j.

2. If i_k has calls other than the one to i pending for node n_k, the completer invokes the c-split operation on i_k, n_k, i, and I_j.

3. If the completer has already added to I_j a c-split i' of i_k whose state part is s_k and which has calls outstanding for the same nodes as i_k does (footnote 4), the completer marks i_k as dead (from a merge) and adds it to I_j. Otherwise, the completer changes the state of i_k to be s_k and adds it to I_j.

(N.B. As in the predictor, the item i' mentioned above may not still be in state s_k at the time of the current completion. Intervening completions may have changed the state of i', but the completer still marks i_k as dead.)



Footnote 4: This restriction on calls merely reflects the fact that, unlike the state of string items, the state of graph items does not reflect which non-terminals are being derived. Thus, when determining whether to merge graph items, this part of their 'state' must be checked separately. Note that only which nodes have calls outstanding matters here, not the items to which those calls are made.

The Algorithm

First, we construct I_0 as follows:

1. For each rule R_k in the grammar which derives S, let s_k be the initial state of a recognizer for that rule, computed in accordance with section 4.1.5 and the initial linkage information specified in the input. Add an item to I_0 whose state part is s_k, whose call list is empty, and whose return set is empty.

2. Run the predictor on every item in I_0. If this adds new items to I_0, run the predictor on them, and repeat this process until no new items are added.

Next, we successively construct I_1, ..., I_n. Given I_0, ..., I_{j-1}, we construct I_j as follows:

3. Choose an input node n_j to read next. (This node must be on the right fringe of the current read head.)

4. Run the scanner over every item in I_{j-1}.

5. Run the completer over every item in I_j. If this adds new items to I_j, run the completer over them, and repeat this until no new items are added.

6. Run the predictor over every item in I_j. If this adds new items to I_j, run the predictor over them, and repeat this until no new items are added.
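The "run the operation, then run it again over anything it added" pattern in steps 2, 5, and 6 is a closure computation over a growing list. A minimal sketch follows, assuming each operation is a callable that may append to the list it is given; the function names and signatures are hypothetical, not taken from the report.

```python
def run_to_closure(operation, item_list):
    """Run `operation` over every item on `item_list`, including items it adds.

    `operation(item, item_list)` may append new items to `item_list`.
    Each item is visited exactly once; the loop ends when the position
    catches up with the end of the list, i.e. when nothing new was added.
    """
    position = 0
    while position < len(item_list):
        operation(item_list[position], item_list)
        position += 1
    return item_list


def build_next_list(previous, node, scanner, completer, predictor):
    """Steps 4 to 6: construct list I_j from I_{j-1} for the chosen node."""
    current = []
    for item in previous:                  # step 4: scanner over I_{j-1}
        scanner(item, node, current)
    run_to_closure(completer, current)     # step 5: completer to closure
    run_to_closure(predictor, current)     # step 6: predictor to closure
    return current
```

Indexing by position rather than iterating over the list directly keeps the loop well defined while the list grows underneath it.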

A little thought will convince the reader that this algorithm produces the lists given in the examples above. A graph is accepted by this algorithm if I_n contains an item whose call lists and return set are empty and whose state part is an accepting state of a recognizer for a rule deriving S.

This algorithm can be converted to produce a parse, using techniques exactly like those presented in the last chapter.
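The acceptance condition is a simple scan of the final list. The sketch below assumes items expose `state`, `calls` (a mapping from node to call list), and `returns` attributes, and that an `is_accepting_state` predicate is available; all of these names are hypothetical.

```python
from types import SimpleNamespace


def accepts(final_list, is_accepting_state):
    """True if some item on the last list I_n meets the acceptance condition."""
    return any(
        not any(item.calls.values())        # every call list is empty
        and not item.returns                # the return set is empty
        and is_accepting_state(item.state)  # recognizer for a rule deriving S
        for item in final_list
    )


# Usage: one finished item and one item still waiting on a call.
done = SimpleNamespace(state="accept", calls={"n": []}, returns=set())
waiting = SimpleNamespace(state="accept", calls={"n": [done]}, returns=set())
```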



Chapter 5. 
Discussion 



In this chapter, we discuss a variety of issues related to flow graphs, flow grammars, and our parsing algorithm. We include a complexity analysis of the algorithm and some suggestions for related research.



5.1. Flow Graphs and Grammars 

As was mentioned in the introduction, flow graphs were abstracted from program descriptions called plans. The goal of the abstraction was to preserve two structural features of plans: (i) the partial ordering of operations, and (ii) the naming of inputs and outputs. There were a number of structural features of plans that were left out of flow graphs, most notably fan-out: the use of an operation's single output as the input of more than one other operation. The criteria used to determine which features of plans would be preserved in flow graphs grew out of the author's work in program analysis and are not germane here; however, we say more below about how graphic representations which include other features may be usefully manipulated by our parsing algorithm.

It should be quite clear from the above that the structure of flow graphs and flow grammars was developed without much regard for graph-theoretic concerns. This does not necessarily mean, however, that they are devoid of theoretical interest. If we view flow grammars as generalizations of string grammars which generate partially- as opposed to totally-ordered sentences, then the following sorts of questions naturally arise:

• Is there a natural definition of a "finite-state flow-graph automaton"? Is it possible to develop a hierarchy of such automata analogous to the string-automata hierarchy? What is the relationship between this hierarchy and the string hierarchy?

• Is it possible to develop a hierarchy of flow grammars analogous to Chomsky's type 0-3 string-grammar hierarchy? In particular, are there canonical-form results for these grammars, and do the sets of languages generated by such grammars display any interesting closure properties?

• If the answers to the above questions are affirmative, can we relate the automaton and grammar hierarchies in a manner reminiscent of the string case?

Our research into flow graph parsing, as might be surmised from the references mentioned in chapter 1, was initially concerned with questions such as these. We hoped initially that flow-graph parsing might be reducible to context-free string parsing. We eventually gave up on achieving such a direct connection between the two, and we now believe that flow grammars have a strictly greater generative capacity than string grammars. Even so, the intuitions we developed in the course of this research seem to indicate strongly that the answers to the questions given above are generally affirmative.



5.2. Applicability of the Algorithm 

One of the most interesting features of our parsing algorithm is its amenability to 'advice' from outside sources of knowledge. Since the algorithm's control mechanism works by consulting and updating an explicit agenda (the current item list), it is a relatively easy matter for external agents to control or influence the algorithm's behavior; they need only make alterations in this list.

For example, let us consider the fan-out problem mentioned above. Figure 5.1 shows a grammar fragment and a pseudo-flow graph which contains fan-out. This figure represents the result of a typical program optimization and may be read as follows:

    A and B are high-level operations. A may be implemented by operation a followed by operation b, while B may be implemented by operation a followed by operation c. If I have a program which performs both A and B, I can optimize the program by performing a just once and then using its output as the input of both b and c.
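To make the notion of fan-out concrete, the following sketch detects it in an edge list. The encoding of edges as (producer, output port, consumer) triples and the function name are assumptions chosen for the example; the report does not prescribe a representation.

```python
from collections import defaultdict


def find_fan_out(edges):
    """Return the output ports that feed more than one consumer.

    edges: iterable of (producer, output_port, consumer) triples.
    """
    consumers = defaultdict(set)
    for producer, port, consumer in edges:
        consumers[(producer, port)].add(consumer)
    return {src: sorted(dst) for src, dst in consumers.items() if len(dst) > 1}


# The optimized program: a single `a` whose output feeds both b and c.
optimized = [("a", "out", "b"), ("a", "out", "c")]
# The unoptimized program: two separate copies of `a`.
separate = [("a1", "out", "b"), ("a2", "out", "c")]
```

Here `find_fan_out(optimized)` reports the shared output of `a`, while `find_fan_out(separate)` is empty, which is why only the second program is an ordinary flow graph.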






[Figure 5.1 diagram: a grammar fragment deriving A as a followed by b and B as a followed by c, and a pseudo-flow graph in which a single a feeds both b and c.]

Figure 5.1. Flow grammar fragment and pseudo-flow graph. The graph displays fan-out and can be thought of as an optimization of the two flow graphs generated by the grammar fragment.



We cannot recover this analysis simply by parsing, because the flow graph which represents the optimized program cannot be read all the way through by our read-head mechanism. But consider the state of the parser at the point where its read head encounters the fan-out. The grammar fragment shown in figure 5.1 will have given rise to the following two item skeletons:






[Diagram: the two item skeletons.]



A fan-out handler invoked at this point, basing its actions on a theory of program optimization through shared operations, might (i) replace e in the parser's head position by e_1 and e_2, and (ii) change these item fragments to read:



[Diagram: the altered item fragments.]






These actions would allow the parser, using the rules of figure 5.1, to recover both the desired analyses for the input.

Readers may be dismayed by the shocking informality of such a solution (footnote 1). The point here is that our parsing algorithm embeds a notion of input structure which does not take into account anomalies due to sharing. A domain expert which understands such structural anomalies can show the parser how to proceed in these cases by altering its state in the aforementioned way; the operations themselves may seem low-level, but the theory underlying them is not.

To put it another way, it is important here not to confuse "representation-level" with "low-level". Because the algorithm's representations fairly directly reflect the state of the parsers being simulated, a wide range of grammar- and graph-theoretic operations can be implemented directly via simple representation-level operations. In fact, it is precisely this closeness between the theoretic and representational levels that makes our algorithm so easy to advise.

Footnote 1: In fact, we expect that readers are screaming "Ouch! What a hack!"

5.3. Correctness

We have done no substantive work on a correctness proof for our parsing algorithm. For one thing, such a proof would ideally require definitions of flow graphs, flow grammars, and the graph-derivation process which are a great deal more rigorous than those presented here. For another, the algorithm itself would have to be stated quite a bit more precisely than we have stated it. To readers who might be interested in constructing a correctness proof, we recommend the proof of correctness of Earley's algorithm contained in [Aho and Ullman 1972]. The structure of that proof would probably serve as a good model for a correctness proof in this case.



5.4. Complexity Analysis

We conclude by considering the running time of our simulation algorithm. In our analysis, we will be concerned entirely with obtaining a simple upper bound on worst-case time complexity as a function of the number of nodes and edges in the input graph, paying only cursory attention to space complexity or time/space complexity as a function of the input grammar. The point of this focus is twofold: (i) we are interested primarily in capturing intuitions about the nature of the algorithm's operations, and (ii) we will be satisfied to show that the algorithm displays polynomial rather than exponential time growth with the size of the input.

The algorithm spends all its time, except for a constant amount at the beginning, running the scanner, predictor, and completer over each constructed item. Thus the total running time of the algorithm is the product of the number of constructed items and the sum of the times needed to run each of these three operations on an item. We will consider each factor separately (footnote 2).

In what follows, we will be using the following definitions.

p = the number of rules in the grammar,

w = the maximum number of inputs to a grammar rule,

t = the maximum number of pairs in the state of a recognizer,

h = the maximum number of edges in a head position,

e = the number of edges in the input graph, and

n = the number of nodes in the input graph.

In addition, we will often be making an ordered choice of k items from among m items. There are

$$ m \times (m-1) \times \cdots \times (m-(k-1)) = k! \binom{m}{k} $$

ways of making such a choice; we will denote this number by $\begin{bmatrix} m \\ k \end{bmatrix}$.
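As a quick numerical check of the ordered-choice count, the falling product can be compared against k! times the binomial coefficient. The function name below is hypothetical; `math.perm`, `math.comb`, and `math.factorial` are standard-library functions (Python 3.8 or later).

```python
import math


def ordered_choices(m, k):
    """Number of ordered selections of k items from m: m * (m-1) * ... * (m-k+1)."""
    return math.perm(m, k)


# The falling product agrees with k! * C(m, k) for small cases.
for m in range(0, 8):
    for k in range(0, m + 1):
        assert ordered_choices(m, k) == math.factorial(k) * math.comb(m, k)
```

For fixed k the count grows like m^k, which is what makes bounds built from it polynomial in the size of the input.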

Footnote 2: This analysis is patterned directly after that given by Earley in [1968].

The algorithm constructs n + 1 lists. On each list, any two items must differ in one of (i) the rule they were invoked for, (ii) where in the input they were invoked, or (iii) their state. Each start position is an ordered choice of at most w edges from the e in the input, and each state is determined by an ordered choice of at most t edges from the current head position; thus, the number of items on a single list is bounded above by the product of the number of rules p, the number of possible start positions, and the number of possible states.

Both the predictor and the completer need to know whether they have already added a particular item to the current list. If we are concerned only with running time (and not space requirements), we can optimize this operation to take constant time by keeping a table of all such items which is indexed by the three factors mentioned above (footnote 3). We will assume that such an optimization is used in the following analysis.
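One way to picture the membership table is as a mapping keyed by the three distinguishing factors. The sketch below uses a Python dict, which is an assumption of this example rather than the report's data structure: a dict is hash-based, so lookups take constant time on average instead of the worst-case constant time of a dense table. The class and method names are hypothetical.

```python
class ItemTable:
    """Membership table for one item list, keyed by (rule, start, state)."""

    def __init__(self):
        self._items = {}

    def lookup(self, rule, start, state):
        """Return the item already stored under this key, or None."""
        return self._items.get((rule, start, state))

    def add(self, rule, start, state, item):
        """Store `item` unless an equivalent one exists; return the stored item."""
        return self._items.setdefault((rule, start, state), item)
```

A predictor or completer would call `add` and compare the result with the item it passed in to learn whether the item was new. The key components must be hashable, for example tuples of edge names.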

Now we wish to bound the time it takes to run each of the three basic operations on a given item i from list I_j. In this analysis, we will use an augmented predictor operation that also attaches to each created item a unique integer identifier. When the c-split operation splits such an item, this identifier will be copied to the split item, allowing the predictor to tell in constant time whether two items are c-splits of each other.
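The identifier scheme can be sketched as follows. Only the identifier handling is shown; the other steps of the c-split are deliberately omitted, and the names `Item`, `family`, `predicted_item`, `c_split_copy`, and `are_c_splits` are hypothetical.

```python
import itertools
from dataclasses import dataclass

_next_id = itertools.count()        # source of unique integer identifiers


@dataclass(eq=False)
class Item:
    state: object
    family: int                     # assigned at prediction, shared by c-splits


def predicted_item(state):
    """Augmented predictor step: attach a fresh identifier to a new item."""
    return Item(state=state, family=next(_next_id))


def c_split_copy(item):
    """Identifier-copying part of a c-split only; other steps are omitted."""
    return Item(state=item.state, family=item.family)


def are_c_splits(a, b):
    """Constant-time test: distinct items that share an identifier."""
    return a is not b and a.family == b.family
```

Because a split of a split inherits the same identifier, the comparison also respects the transitivity of the c-split relation.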

The scanner either copies i or simulates a state transition for the recognizer represented by i. This takes constant time. In addition, the scanner has to check each of up to t pairs in the item's state part to determine whether the item is active on the node read. This also requires time independent of the size of the input graph.

When the predictor considers an item i on I_j whose state contains inputs to target non-terminal(s), it tries to add to I_j up to tp items: one for each rule which derives each of the non-terminals whose inputs have been reached. In addition, it may p-split the up to [expression illegible] members of its call list, a figure derived below. Given the optimization described above, checking whether each of the resulting items has already been added to the list takes constant time, so the predictor takes up to [expression illegible] time on each item.

When the completer considers an item i on I_j whose state part contains trailing edges of its target graph, it must process every item in i's return set. Thus, the amount of time taken by the completer is the product of the maximum number of items in a return set and the amount of time it takes to process each such item.

Return pointers in an item arise from two sources. Originally, an item has one return pointer for each of its calling items. However, as the parser runs and its callers split, an item may contain several return pointers to different splits of one original caller. We consider first how many original callers an item can have, and then how many splits each can turn into.



Footnote 3: Such a table, of course, will tend to be very sparse; the author's actual implementation of the algorithm used a more compact, time-consuming representation.

A given sub-recognizer R can only have been originally called from recognizers at one of w head positions (one for each of its inputs). Thus, its original calling recognizers must have been in one of at most $\begin{bmatrix} h \\ t \end{bmatrix}$ states, so there are at most [expression illegible] original calling items.

Suppose an instance of a recognizer r (which was called at some specific input position) calls an item i to derive a node n. Every state transition on a node other than n that the calling recognizer takes while i is running might split the caller and add up to p return pointers to i. Since at least one input of n will not appear in the state led to by such a transition, there are at most [expression illegible] possible states which contribute splits, leading to a total return set membership of

[expression illegible]



When a calling item i c- or r-splits as the result of a callee's return, we must add the split-off item to the call lists of i's callers and the return lists of i's callees. This will take as much time as there are callers and callees. We saw above how many callers there are; a similar argument (utilizing the symmetry of creation of call and return pointers) shows that there are up to [expression illegible] callees. Thus, splitting an item may take up to the sum of these figures, leading to a total cost for the completer of

[expression illegible]



When we add the costs of the three operations together, we find that the completer dominates the cost of the other operations, so the cost of the entire algorithm is bounded by the product of the completer cost and the total number of items. This product is polynomial in e; in the string case it reduces to $e^3 = n^3$, which is the cost of Earley's algorithm.
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